Notes on group ICA
2022-05-15
Group ICA is a popular multivariate unsupervised learning method for population neuroimaging data (e.g. fMRI, ERP, EEG) based on independent component analysis (ICA). It performs ICA on the population data after two data reductions – first at subject level, then at population level.
Suppose there are $M$ subjects with fMRI data. Let $Y_i$ denote subject $i$’s preprocessed and spatially normalized fMRI data of dimension $K \times V$, where $K$ is the number of time points and $V$ is the number of voxels in the brain. The first reduction, at the subject level, uses PCA to reduce the time dimension from $K$ to $L$ while retaining as much variance as possible, and we have
$$X_i = F_i^{-1} Y_i$$
where
- $F_i^{-1} \in \mathbb{R}^{L \times K}$: reducing matrix determined by PCA
- $X_i \in \mathbb{R}^{L \times V}$: reduced data matrix of subject $i$
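The subject-level reduction can be sketched with a plain SVD in NumPy. This is a minimal illustration, not the procedure of any particular toolbox: the helper name `pca_reduce`, the toy sizes, the random data, and the centering convention are all assumptions, and no whitening is applied.

```python
import numpy as np

def pca_reduce(Y, n_keep):
    """PCA along the rows of Y (assumed mean-centered).

    Returns the n_keep x K reducing matrix and the n_keep x V reduced data.
    """
    # Thin SVD: the columns of U are the principal directions in the time
    # dimension, ordered by decreasing variance.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    F_inv = U[:, :n_keep].T      # L x K, orthonormal rows
    X = F_inv @ Y                # L x V
    return F_inv, X

rng = np.random.default_rng(0)
K, V, L = 50, 400, 10            # toy sizes, far smaller than real fMRI data
Y_i = rng.standard_normal((K, V))
Y_i -= Y_i.mean(axis=1, keepdims=True)   # centering choice is an assumption
F_inv_i, X_i = pca_reduce(Y_i, L)
print(X_i.shape)                 # (10, 400)
```

Because the rows of `F_inv_i` are orthonormal in this sketch, its transpose plays the role of $F_i$ when mapping reduced data back to the original time dimension.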
After the subject-level reduction, we concatenate the dimension-reduced data from all subjects and perform another reduction with PCA. After the second reduction we have $X = G^{-1} \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix}$, where $G^{-1} \in \mathbb{R}^{N \times LM}$ is the reducing matrix determined by PCA on the concatenated data of the $M$ subjects, and $X \in \mathbb{R}^{N \times V}$.
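A minimal NumPy sketch of the two-stage reduction follows, assuming random toy data and a hypothetical helper `pca_reduce`. It is mainly meant to make the matrix shapes concrete: the stacked matrix is $LM \times V$ and the group-level result is $N \times V$.

```python
import numpy as np

def pca_reduce(Y, n_keep):
    """Return (reducing matrix, reduced data) from a thin SVD of Y."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    R = U[:, :n_keep].T
    return R, R @ Y

rng = np.random.default_rng(0)
M, K, V, L, N = 4, 50, 400, 10, 6    # toy sizes (assumed for illustration)

# Subject level: reduce each K x V data set to L x V.
reduced = []
for _ in range(M):
    Y = rng.standard_normal((K, V))
    Y -= Y.mean(axis=1, keepdims=True)
    _, X_i = pca_reduce(Y, L)
    reduced.append(X_i)

# Group level: stack along the reduced time dimension, then reduce to N rows.
stacked = np.vstack(reduced)         # LM x V
G_inv, X = pca_reduce(stacked, N)    # N x LM and N x V
G = G_inv.T                          # LM x N, maps X back to the stacked space

print(stacked.shape, G_inv.shape, X.shape)   # (40, 400) (6, 40) (6, 400)
```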
After these two data reduction steps, ICA is applied to find the mixing matrix $\hat{A} \in \mathbb{R}^{N \times N}$ such that $X = \hat{A} \hat{S}$ (its inverse $\hat{A}^{-1}$ is the unmixing matrix), where $\hat{S} \in \mathbb{R}^{N \times V}$ corresponds to the $N$ spatially independent brain (BOLD) activation maps shared by the population.
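To make the factorization $X = \hat{A} \hat{S}$ concrete, here is a toy symmetric FastICA with a tanh nonlinearity, written in NumPy. FastICA is used here only because it is compact; it is not necessarily the ICA algorithm used in the cited papers, and the function name `fastica_symmetric`, the toy sources, and all parameter values are assumptions. A real analysis would rely on an established implementation.

```python
import numpy as np

def fastica_symmetric(X, n_iter=200, tol=1e-6, seed=0):
    """Toy symmetric FastICA on an N x V matrix (samples = voxels).

    Returns (A_hat, S_hat) such that the row-centered X equals A_hat @ S_hat.
    """
    N, V = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)       # center each row over voxels
    # Whitening: Z = Wh @ Xc has identity covariance across voxels.
    d, E = np.linalg.eigh(Xc @ Xc.T / V)
    Wh = (E / np.sqrt(d)).T
    Z = Wh @ Xc
    # Random orthogonal starting point for the rotation W.
    rng = np.random.default_rng(seed)
    W, _ = np.linalg.qr(rng.standard_normal((N, N)))
    for _ in range(n_iter):
        gY = np.tanh(W @ Z)
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] W, row by row.
        W_new = gY @ Z.T / V - np.diag((1.0 - gY**2).mean(axis=1)) @ W
        # Symmetric decorrelation W <- (W W^T)^(-1/2) W, computed via SVD.
        U, _, Vt = np.linalg.svd(W_new)
        W_new = U @ Vt
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    S_hat = W @ Z                                # N x V estimated spatial maps
    A_hat = np.linalg.inv(W @ Wh)                # N x N mixing matrix
    return A_hat, S_hat

# Toy example: mix N non-Gaussian "maps" and unmix them again.
rng = np.random.default_rng(1)
N, V = 3, 5000
S_true = rng.laplace(size=(N, V))
A_true = rng.standard_normal((N, N))
X = A_true @ S_true
A_hat, S_hat = fastica_symmetric(X)
print(A_hat.shape, S_hat.shape)                  # (3, 3) (3, 5000)
```

As with any ICA, the recovered rows of `S_hat` match the true sources only up to permutation, sign, and scale.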
We can then project the shared components back to each subject’s space using the relationship $G \hat{A} \hat{S} = \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix}$, where $G \in \mathbb{R}^{LM \times N}$ (the equality holds up to the variance discarded by the group-level PCA). Partitioning $G$ into subject-wise blocks $G_i \in \mathbb{R}^{L \times N}$, we have $\begin{bmatrix} G_1 \\ G_2 \\ \vdots \\ G_M \end{bmatrix} \hat{A} \hat{S} = \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix} \Rightarrow G_i \hat{A} \hat{S}_i = F_i^{-1} Y_i$, where $\hat{S}_i \in \mathbb{R}^{N \times V}$ is the single-subject spatial map of subject $i$, which can be computed as $\hat{S}_i = (G_i \hat{A})^{-1} F_i^{-1} Y_i$. Since $G_i \hat{A}$ is $L \times N$, the inverse is to be read as a pseudo-inverse when $L \neq N$. We also have $Y_i \approx F_i G_i \hat{A} \hat{S}_i$.
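The back-reconstruction algebra can be checked numerically. The sketch below assumes noiseless toy data in which every subject mixes the same $N$ spatial maps, so both PCA steps lose nothing and the relations hold exactly. The matrix `A_hat` is a random orthogonal stand-in for an ICA mixing matrix, not an ICA result, and `pca_reduce` is a hypothetical helper.

```python
import numpy as np

def pca_reduce(Y, n_keep):
    """Return (reducing matrix, reduced data) from a thin SVD of Y."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    R = U[:, :n_keep].T
    return R, R @ Y

rng = np.random.default_rng(0)
M, K, V, L, N = 3, 30, 200, 6, 4     # toy sizes (assumed for illustration)

# Noiseless toy data: each subject mixes the same N centered spatial maps.
S_true = rng.laplace(size=(N, V))
S_true -= S_true.mean(axis=1, keepdims=True)
Ys = [rng.standard_normal((K, N)) @ S_true for _ in range(M)]

# Two-stage PCA reduction.
F_invs, Xs = zip(*(pca_reduce(Y, L) for Y in Ys))
G_inv, X = pca_reduce(np.vstack(Xs), N)
G = G_inv.T                           # LM x N

# Stand-in for the ICA step: any invertible N x N matrix shows the algebra.
A_hat = np.linalg.qr(rng.standard_normal((N, N)))[0]
S_hat = np.linalg.solve(A_hat, X)     # so that X = A_hat @ S_hat

S_subj, Y_rec = [], []
for i in range(M):
    G_i = G[i * L:(i + 1) * L]        # L x N block belonging to subject i
    # L != N here, so the "inverse" is a pseudo-inverse.
    S_i = np.linalg.pinv(G_i @ A_hat) @ Xs[i]     # N x V subject map
    F_i = F_invs[i].T                 # K x L, valid because rows are orthonormal
    S_subj.append(S_i)
    Y_rec.append(F_i @ G_i @ A_hat @ S_i)

print(S_subj[0].shape)                # (4, 200)
```

In this idealized setting each subject map coincides with the group map and `Y_rec[i]` reproduces `Ys[i]`; with real, noisy data both relations are only approximate.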
In this derivation, the data reductions are performed along the time dimension, and ICA unmixes spatially independent components. This is the so-called spatial ICA. It is more popular than temporal ICA, where the reductions are done along the spatial dimension, though the two are “merely two different modeling assumptions” (Calhoun et al., 2009).
References
- Calhoun, V. D., Adali, T., Pearlson, G. D., & Pekar, J. J. (2001). A method for making group inferences from functional MRI data using independent component analysis. Human Brain Mapping, 14(3), 140-151.
- Calhoun, V. D., Liu, J., & Adalı, T. (2009). A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data. NeuroImage, 45(1), S163-S172.