Notes on group ICA

2022-05-15

Group ICA is a popular multivariate unsupervised learning method for population neuroimaging data (e.g. fMRI, ERP, EEG) based on independent component analysis (ICA). It performs ICA on the population data after two data reductions – first at subject level, then at population level.

Suppose there are $M$ subjects with fMRI data. Let $Y_i$ denote subject $i$'s preprocessed and spatially normalized fMRI data of dimension $K \times V$, where $K$ is the number of time points and $V$ is the number of voxels in the brain. The first reduction, at the subject level, uses PCA to reduce the time dimension from $K$ to $L$ while retaining as much variance as possible, giving

$$X_i = F_i^{-1} Y_i$$

where

$F_i^{-1} \in \mathbb{R}^{L \times K}$ : reducing matrix determined by PCA
$X_i \in \mathbb{R}^{L \times V}$ : reduced data matrix of subject $i$
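
As a rough illustration of the subject-level reduction, the NumPy sketch below builds $F_i^{-1}$ from the top $L$ left singular vectors of $Y_i$. The data are random, and the helper name `pca_reduce`, the sizes, and the choice not to whiten the components are all assumptions made for the example rather than details of the method described above.

```python
import numpy as np

def pca_reduce(Y, n_components):
    """Hypothetical helper: PCA along the rows (time) of Y via SVD.

    Returns the reducing matrix (n_components x K) and the reduced data.
    """
    # Left singular vectors of Y are the principal directions in time.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    F_inv = U[:, :n_components].T          # orthonormal rows
    return F_inv, F_inv @ Y

rng = np.random.default_rng(0)
K, V, L = 40, 500, 10                      # illustrative sizes only
Y_i = rng.standard_normal((K, V))          # stand-in for one subject's data
Y_i -= Y_i.mean(axis=1, keepdims=True)     # center each time point across voxels

F_inv_i, X_i = pca_reduce(Y_i, L)
print(F_inv_i.shape, X_i.shape)            # (10, 40) (10, 500)
```

Because the rows of `F_inv_i` are orthonormal in this sketch, `F_inv_i.T` plays the role of $F_i$ when mapping back, and `F_inv_i.T @ X_i` is the best rank-$L$ approximation of `Y_i`.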

After the subject-level reduction, we concatenate the dimension-reduced data from all subjects along the reduced time dimension and perform another reduction with PCA. After the second data reduction we have

$$X = G^{-1} \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix}$$

where $G^{-1} \in \mathbb{R}^{N \times LM}$ is the reducing matrix determined by PCA on the concatenated population data from the $M$ subjects, and $X \in \mathbb{R}^{N \times V}$ is the group-level reduced data.
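
The concatenation and second reduction can be sketched in NumPy as follows. This is a minimal example on random data; `pca_reduce` is a hypothetical helper, and the sizes are arbitrary.

```python
import numpy as np

def pca_reduce(Y, n_components):
    """Hypothetical helper: keep the top principal directions along the rows of Y."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    R = U[:, :n_components].T              # reducing matrix with orthonormal rows
    return R, R @ Y

rng = np.random.default_rng(0)
M, K, V, L, N = 4, 40, 500, 10, 3          # illustrative sizes only

# Subject-level reduction: one L x V matrix per subject.
X_subjects = []
for _ in range(M):
    Y_i = rng.standard_normal((K, V))      # stand-in for subject data
    _, X_i = pca_reduce(Y_i, L)
    X_subjects.append(X_i)

# Stack along the reduced time dimension, then reduce again.
X_stacked = np.vstack(X_subjects)          # (L*M) x V
G_inv, X = pca_reduce(X_stacked, N)        # G_inv is N x (L*M), X is N x V
print(X_stacked.shape, G_inv.shape, X.shape)   # (40, 500) (3, 40) (3, 500)
```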

After these two data reduction steps, ICA is applied to $X$ to estimate the mixing matrix $\hat{A} \in \mathbb{R}^{N \times N}$ such that $X = \hat{A} \hat{S}$ (the unmixing matrix is its inverse, $\hat{S} = \hat{A}^{-1} X$), where $\hat{S} \in \mathbb{R}^{N \times V}$ corresponds to the $N$ spatially independent brain (BOLD) activation maps shared by the population.
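
The notes do not say which ICA algorithm is used. The sketch below is a small symmetric FastICA with a tanh nonlinearity written in plain NumPy, offered only as one possible way to obtain $\hat{A}$ and $\hat{S}$ with $X = \hat{A} \hat{S}$. The function names, the synthetic Laplace sources, and the fixed mixing matrix `A_true` are assumptions for the example; the rows of $X$ are mean-centered first, so the factorization holds for the centered data.

```python
import numpy as np

def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, i.e. W with its rows made orthonormal."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def fastica(X, n_iter=200, tol=1e-8, seed=0):
    """Illustrative symmetric FastICA (tanh nonlinearity) on an N x V matrix.

    Returns (A_hat, S_hat, X_centered) with X_centered = A_hat @ S_hat.
    """
    N, V = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    # Whiten: Z has identity covariance across the V samples.
    d, E = np.linalg.eigh(Xc @ Xc.T / V)
    whitener = (E / np.sqrt(d)).T
    Z = whitener @ Xc
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((N, N)))
    for _ in range(n_iter):
        g_val = np.tanh(W @ Z)
        g_prime_mean = (1.0 - g_val**2).mean(axis=1)
        # Fixed-point update for all rows at once, then re-orthonormalize.
        W_new = sym_decorrelate(g_val @ Z.T / V - g_prime_mean[:, None] * W)
        # Converged when every new row is parallel to the old one (up to sign).
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    S_hat = W @ Z
    A_hat = np.linalg.inv(W @ whitener)    # mixing matrix
    return A_hat, S_hat, Xc

rng = np.random.default_rng(1)
N, V = 3, 5000
S_true = rng.laplace(size=(N, V))          # non-Gaussian synthetic sources
A_true = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
X = A_true @ S_true

A_hat, S_hat, Xc = fastica(X)
# Absolute correlation between each estimated and each true source.
corr = np.abs(np.corrcoef(S_hat, S_true)[:N, N:])
print(np.allclose(A_hat @ S_hat, Xc))      # True
print(corr.max(axis=1))
```

ICA recovers sources only up to permutation, sign, and scale, which is why the check uses absolute correlations rather than a direct comparison with `S_true`.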

We can then project the shared components back to each subject's space using the relationship

$$G \hat{A} \hat{S} \approx \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix}$$

where $G \in \mathbb{R}^{LM \times N}$ undoes the group-level reduction. Partitioning $G$ into subject-wise blocks $G_i \in \mathbb{R}^{L \times N}$ gives

$$\begin{bmatrix} G_1 \\ G_2 \\ \vdots \\ G_M \end{bmatrix} \hat{A} \hat{S} \approx \begin{bmatrix}F_1^{-1} Y_1 \\ F_2^{-1} Y_2 \\ \vdots \\ F_M^{-1} Y_M \end{bmatrix} \Rightarrow G_i \hat{A} \hat{S}_i = F_i^{-1} Y_i$$

where $\hat{S}_i \in \mathbb{R}^{N \times V}$ is the single-subject spatial map for subject $i$. It can be computed as $\hat{S}_i = (G_i \hat{A})^{-1} F_i^{-1} Y_i$, with the inverse understood as a pseudo-inverse when $L \neq N$, since $G_i \hat{A}$ is $L \times N$. We also have $Y_i \approx F_i G_i \hat{A} \hat{S}_i$.
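
The back-reconstruction algebra can be checked numerically. The sketch below uses noise-free synthetic data in which every subject mixes the same $N$ spatial maps, so the relations hold exactly instead of approximately. Two things are assumptions of the example: `pca_reduce` is a hypothetical helper, and a random orthogonal matrix stands in for the ICA estimate $\hat{A}$, since the algebra holds for any invertible $N \times N$ matrix.

```python
import numpy as np

def pca_reduce(Y, n_components):
    """Hypothetical helper: reducing matrix with orthonormal rows, plus reduced data."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    R = U[:, :n_components].T
    return R, R @ Y

rng = np.random.default_rng(0)
M, K, V, L, N = 4, 40, 500, 10, 3          # illustrative sizes only

# Noise-free synthetic data: every subject mixes the same N spatial maps.
S_true = rng.laplace(size=(N, V))
Y = [rng.standard_normal((K, N)) @ S_true for _ in range(M)]

# Two-stage reduction: subject level, then group level.
F_inv, X_sub = zip(*(pca_reduce(Y_i, L) for Y_i in Y))
G_inv, X = pca_reduce(np.vstack(X_sub), N)

# Stand-in for the ICA estimate (NOT a real ICA fit): a random orthogonal matrix.
A_hat, _ = np.linalg.qr(rng.standard_normal((N, N)))
S_hat = np.linalg.inv(A_hat) @ X

G = G_inv.T                                # pseudo-inverse of G_inv (orthonormal rows)
S_sub, Y_rec = [], []
for i in range(M):
    G_i = G[i * L:(i + 1) * L]             # L x N block for subject i
    S_i = np.linalg.pinv(G_i @ A_hat) @ X_sub[i]
    S_sub.append(S_i)
    Y_rec.append(F_inv[i].T @ G_i @ A_hat @ S_i)   # F_i = F_inv[i].T in this sketch

print(S_sub[0].shape)                      # (3, 500)
print(all(np.allclose(Yr, Yi) for Yr, Yi in zip(Y_rec, Y)))
```

With real, noisy data the reductions discard variance, so the reconstruction is only approximate and the subject-specific maps differ from the group maps and from one another.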

In this derivation, the data reductions are performed along the time dimension, and ICA unmixes spatial components; this is the so-called spatial ICA. It is more popular than temporal ICA, where the reductions are done along the spatial dimension, though the two are “merely two different modeling assumptions” (Calhoun et al., 2009).

References

Calhoun, V. D., Adalı, T., Pearlson, G. D., & Pekar, J. J. (2001). A method for making group inferences from functional MRI data using independent component analysis. Human Brain Mapping, 14(3), 140-151.
Calhoun, V. D., Liu, J., & Adalı, T. (2009). A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data. NeuroImage, 45(1), S163-S172.