Notes on group ICA
2022-05-15
Group ICA is a popular multivariate unsupervised learning method for population neuroimaging data (e.g. fMRI, ERP, EEG) based on independent component analysis (ICA). It performs ICA on the population data after two data reductions – first at subject level, then at population level.
Suppose there are $M$ subjects with fMRI data. Let $Y_i$ denote subject $i$’s preprocessed and spatially normalized fMRI data of dimension $K \times V$, where $K$ is the number of time points and $V$ is the number of voxels in the brain. The first reduction, at the subject level, uses PCA to reduce the time dimension from $K$ to $L$ while retaining as much variance as possible, which gives
$$X_i = F_i^{-1} Y_i$$
where
- $F_i^{-1} \in \mathbb{R}^{L \times K}$: reducing matrix determined by PCA
- $X_i \in \mathbb{R}^{L \times V}$: reduced data matrix of subject $i$
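The subject-level step can be sketched in NumPy as follows. This is a minimal illustration, not the procedure of any particular toolbox: the function name `subject_pca`, the toy dimensions, and the use of an SVD on data assumed to be already centered are all assumptions made for the example.

```python
import numpy as np

def subject_pca(Y, L):
    """Reduce a (K, V) subject matrix to (L, V) along the time dimension.

    Assumes Y has already been centered during preprocessing.
    Returns (F_inv, X) with F_inv of shape (L, K) and X = F_inv @ Y.
    """
    # Left singular vectors span the dominant directions in the time dimension.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    F_inv = U[:, :L].T      # (L, K) reducing matrix with orthonormal rows
    X = F_inv @ Y           # (L, V) reduced data
    return F_inv, X

rng = np.random.default_rng(0)
K, V, L = 40, 500, 10       # toy sizes, far smaller than real fMRI data
Y_i = rng.standard_normal((K, V))
F_inv, X_i = subject_pca(Y_i, L)
print(F_inv.shape, X_i.shape)
```

Because the rows of `F_inv` are orthonormal in this sketch, its pseudoinverse (the matrix written $F_i$ in these notes) is simply its transpose.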
After the subject-level reduction, we concatenate the dimension-reduced data from all subjects along the time dimension and perform a second PCA reduction. The result is $X = G^{-1} \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix}$, where $G^{-1} \in \mathbb{R}^{N \times LM}$ is the reducing matrix determined by PCA on the concatenated data of the $M$ subjects, so that $X \in \mathbb{R}^{N \times V}$.
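A rough sketch of the two reductions in sequence is given below, mainly to make the shapes concrete. The helper `pca_reduce`, the random toy data, and the SVD-based PCA without centering are assumptions of this example rather than anything prescribed by the notes.

```python
import numpy as np

def pca_reduce(D, n):
    """Return (R, R @ D), where the n rows of R are the top left singular vectors of D."""
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    R = U[:, :n].T
    return R, R @ D

rng = np.random.default_rng(0)
M, K, V, L, N = 5, 40, 500, 10, 6   # toy sizes chosen for illustration only
Ys = [rng.standard_normal((K, V)) for _ in range(M)]

# Subject-level reduction: X_i = F_i^{-1} Y_i, each of shape (L, V).
Xs = [pca_reduce(Y, L)[1] for Y in Ys]

# Stack the reduced subjects along the (reduced) time dimension: shape (L*M, V).
X_cat = np.vstack(Xs)

# Group-level reduction: X = G^{-1} X_cat, shape (N, V).
G_inv, X = pca_reduce(X_cat, N)
print(X_cat.shape, G_inv.shape, X.shape)
```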
After these two data reduction steps, ICA is applied to estimate the mixing matrix $\hat{A} \in \mathbb{R}^{N \times N}$ (equivalently, its inverse, the unmixing matrix) such that $X = \hat{A} \hat{S}$, where the rows of $\hat{S} \in \mathbb{R}^{N \times V}$ correspond to the $N$ spatially independent brain (BOLD) activation maps shared by the population.
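The notes do not say which ICA algorithm is used, so the sketch below substitutes a bare-bones symmetric FastICA with a tanh nonlinearity, written in plain NumPy, purely to illustrate the factorization $X = \hat{A} \hat{S}$. The function names, the toy Laplace sources, and the choice to center each row of $X$ before unmixing are assumptions of this example.

```python
import numpy as np

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^{-1/2} W."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W

def spatial_ica(X, n_iter=200, tol=1e-6, seed=0):
    """Factor the row-centered (N, V) matrix X as A @ S with S of shape (N, V)."""
    N, V = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)      # center each component over voxels
    # Whiten so that the rows of Z are uncorrelated with unit variance.
    d, E = np.linalg.eigh(Xc @ Xc.T / V)
    Wh = (E / np.sqrt(d)).T
    Z = Wh @ Xc
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((N, N)))
    for _ in range(n_iter):
        T = np.tanh(W @ Z)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for all rows at once.
        W_new = (T @ Z.T) / V - np.diag((1.0 - T**2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        delta = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if delta < tol:
            break
    S = W @ Z                       # estimated spatial maps, one per row
    A = np.linalg.inv(W @ Wh)       # estimated mixing matrix, so Xc = A @ S
    return A, S

# Toy data: N super-Gaussian "maps" mixed by a random square matrix.
rng = np.random.default_rng(0)
N, V = 4, 5000
S_true = rng.laplace(size=(N, V))
A_true = rng.standard_normal((N, N))
X = A_true @ S_true
A_hat, S_hat = spatial_ica(X)
print(A_hat.shape, S_hat.shape)
```

As with any ICA, the recovered rows of `S_hat` are only defined up to permutation, sign, and scale relative to the true sources.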
We can then project the shared components back to each subject’s space using the relationship $G \hat{A} \hat{S} = \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix}$. If we partition the matrix $G \in \mathbb{R}^{LM \times N}$ into subject-wise blocks $G_i \in \mathbb{R}^{L \times N}$, we have $\begin{bmatrix} G_1 \\ G_2 \\ \vdots \\ G_M \end{bmatrix} \hat{A} \hat{S} = \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix} \Rightarrow G_i \hat{A} \hat{S}_i = F_i^{-1} Y_i$, where $\hat{S}_i \in \mathbb{R}^{N \times V}$ is the single-subject spatial map of subject $i$, which can be computed from $\hat{S}_i = (G_i \hat{A})^{-1} F_i^{-1} Y_i$. Because $G_i \hat{A}$ is $L \times N$, the inverse here is a pseudoinverse unless $L = N$. We also have $Y_i \approx F_i G_i \hat{A} \hat{S}_i$.
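The back-reconstruction can be sketched as below. To keep the example short and self-contained, a random orthogonal matrix stands in for the ICA estimate $\hat{A}$; that stand-in, the helper names, and the toy dimensions (with $L \ge N$) are assumptions of this example, and `np.linalg.pinv` is used wherever the notes write an inverse of a non-square matrix.

```python
import numpy as np

def pca_reduce(D, n):
    """Return (R, R @ D), where the n rows of R are the top left singular vectors of D."""
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    R = U[:, :n].T
    return R, R @ D

rng = np.random.default_rng(0)
M, K, V, L, N = 5, 40, 500, 10, 6   # toy sizes with L >= N
Ys = [rng.standard_normal((K, V)) for _ in range(M)]

# Two-stage reduction, as in the derivation.
reduced = [pca_reduce(Y, L) for Y in Ys]
F_invs = [F_inv for F_inv, _ in reduced]
X_cat = np.vstack([X_i for _, X_i in reduced])
G_inv, X = pca_reduce(X_cat, N)

# Hypothetical stand-in for the mixing matrix that ICA would estimate.
A_hat, _ = np.linalg.qr(rng.standard_normal((N, N)))

G = np.linalg.pinv(G_inv)           # (L*M, N)

def back_reconstruct(i):
    """Return (S_i, Y_hat_i) for subject i (0-based index)."""
    G_i = G[i * L:(i + 1) * L, :]                       # (L, N) block for subject i
    S_i = np.linalg.pinv(G_i @ A_hat) @ F_invs[i] @ Ys[i]   # (N, V) subject maps
    F_i = np.linalg.pinv(F_invs[i])                     # (K, L)
    Y_hat_i = F_i @ G_i @ A_hat @ S_i                   # (K, V) approximation of Y_i
    return S_i, Y_hat_i

S_0, Y_hat_0 = back_reconstruct(0)
print(S_0.shape, Y_hat_0.shape)
```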
In this derivation, the data reductions are performed along the time dimension, and ICA unmixes spatially independent components. This is the so-called spatial ICA. It is a more popular choice than temporal ICA, where the reductions are done along the spatial dimension, though the two are “merely two different modeling assumptions” (Calhoun et al., 2009).
References
- Calhoun, V. D., Adali, T., Pearlson, G. D., & Pekar, J. J. (2001). A method for making group inferences from functional MRI data using independent component analysis. Human Brain Mapping, 14(3), 140-151.
- Calhoun, V. D., Liu, J., & Adalı, T. (2009). A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data. NeuroImage, 45(1), S163-S172.