Kernel-independent component analysis

In statistics, kernel-independent component analysis (kernel ICA) is an efficient algorithm for independent component analysis that estimates source components by optimizing a generalized variance contrast function based on representations in a reproducing kernel Hilbert space.^[1]^[2] Such contrast functions use the notion of mutual information as a measure of statistical independence.

Main idea

Kernel ICA is based on the idea that correlations between two random variables can be represented in a reproducing kernel Hilbert space (RKHS), denoted by ${\mathcal {F}}$, associated with a feature map $L_{x}:{\mathcal {F}}\to \mathbb {R}$ defined for a fixed $x\in \mathbb {R}$. The ${\mathcal {F}}$-correlation between two random variables $X$ and $Y$ is defined as

$$\rho _{\mathcal {F}}(X,Y)=\max _{f,g\in {\mathcal {F}}}\operatorname {corr} (\langle L_{X},f\rangle ,\langle L_{Y},g\rangle )$$

where the functions $f,g:\mathbb {R} \to \mathbb {R}$ range over ${\mathcal {F}}$ and

$$\operatorname {corr} (\langle L_{X},f\rangle ,\langle L_{Y},g\rangle ):={\frac {\operatorname {cov} (f(X),g(Y))}{\operatorname {var} (f(X))^{1/2}\operatorname {var} (g(Y))^{1/2}}}$$

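for fixed $f,g\in {\mathcal {F}}$.^[1] Note that the reproducing property implies that $f(x)=\langle L_{x},f\rangle$ for fixed $x\in \mathbb {R}$ and $f\in {\mathcal {F}}$,^[3] so that $\langle L_{X},f\rangle =f(X)$ and $\langle L_{Y},g\rangle =g(Y)$. If $X$ and $Y$ are independent, then $\operatorname {cov} (f(X),g(Y))=0$ for every pair $f,g\in {\mathcal {F}}$, and it follows that the ${\mathcal {F}}$-correlation between two independent random variables is zero.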
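The role of the maximization over $f$ and $g$ can be illustrated numerically. The following is a minimal sketch, not taken from the cited references: it evaluates the sample version of $\operatorname {corr} (f(X),g(Y))$ for a few hand-picked functions. The helper name `corr_fg`, the sample size, and the choice of test functions are assumptions made for illustration; whether a given function belongs to a particular ${\mathcal {F}}$ depends on the kernel.

```python
import numpy as np

def corr_fg(f, g, x, y):
    """Sample version of corr(f(X), g(Y)) for fixed functions f and g."""
    fx = f(np.asarray(x, dtype=float))
    gy = g(np.asarray(y, dtype=float))
    fx = fx - fx.mean()  # center so that dot products act as covariances
    gy = gy - gy.mean()
    return float(fx @ gy / np.sqrt((fx @ fx) * (gy @ gy)))

def identity(t):
    return t

def square(t):
    return t ** 2

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
y_dep = x ** 2                     # fully determined by x, yet linearly uncorrelated with it
y_ind = rng.standard_normal(5000)  # drawn independently of x

# Linear correlation misses the dependence between x and y_dep.
linear_dep = corr_fg(identity, identity, x, y_dep)
# A suitable nonlinear f exposes it.
nonlinear_dep = corr_fg(square, identity, x, y_dep)
# For independent samples the same f and g give a value close to zero.
nonlinear_ind = corr_fg(square, identity, x, y_ind)
```

In this sketch the ordinary correlation between `x` and `y_dep` is close to zero although the two are dependent, while the pair `(square, identity)` gives a correlation close to one. Taking the maximum over a rich class of functions is what lets the ${\mathcal {F}}$-correlation respond to such nonlinear dependence.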

This notion of ${\mathcal {F}}$-correlations is used for defining contrast functions that are optimized in the kernel ICA algorithm. Specifically, if $\mathbf {X} :=(x_{ij})\in \mathbb {R} ^{n\times m}$ is a prewhitened data matrix, that is, the sample mean of each column is zero and the sample covariance matrix, computed with the $n$ rows as observations, is the $m\times m$ identity matrix, kernel ICA estimates an $m\times m$ orthogonal matrix $\mathbf {A}$ so as to minimize finite-sample ${\mathcal {F}}$-correlations between the columns of $\mathbf {S} :=\mathbf {X} \mathbf {A} ^{\prime }$.
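The procedure described above can be sketched for the case $m=2$ as follows. This is an illustrative toy under stated assumptions, not the algorithm of the cited references: the Gaussian kernel, the default values of `sigma` and `kappa`, the regularized kernel canonical correlation form used in `f_corr`, and all function names are assumptions, and a plain grid search over the rotation angle stands in for a proper optimizer over orthogonal matrices.

```python
import numpy as np

def prewhiten(X):
    """Center the columns of X (n x m) and rescale so the sample covariance is I."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T  # symmetric inverse square root
    return Xc @ W

def f_corr(x, y, sigma=1.0, kappa=2e-2):
    """Regularized finite-sample F-correlation estimate (assumed Gaussian kernel)."""
    n = len(x)
    H = np.eye(n) - 1.0 / n  # centering matrix I - (1/n) 1 1'
    c = n * kappa / 2.0      # regularization added to each Gram matrix
    R = []
    for v in (x, y):
        v = np.asarray(v, dtype=float).reshape(-1, 1)
        K = H @ np.exp(-((v - v.T) ** 2) / (2.0 * sigma ** 2)) @ H
        R.append(np.linalg.solve(K + c * np.eye(n), K))  # (K + cI)^{-1} K
    # Largest singular value of the product = first regularized canonical correlation.
    return float(np.linalg.norm(R[0] @ R[1], 2))

def rotation(theta):
    """2 x 2 rotation matrix, the orthogonal candidate for A in this sketch."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def kernel_ica_2d(X, n_angles=45):
    """Grid search over rotations of the whitened data; returns (A, S) with S = Xw A'."""
    Xw = prewhiten(X)
    # Angles in [0, pi/2) suffice: a further quarter turn only permutes and flips columns.
    thetas = np.linspace(0.0, np.pi / 2, n_angles, endpoint=False)
    scores = []
    for t in thetas:
        S = Xw @ rotation(t).T
        scores.append(f_corr(S[:, 0], S[:, 1]))
    A = rotation(thetas[int(np.argmin(scores))])
    return A, Xw @ A.T
```

Calling `kernel_ica_2d(X)` on an $n\times 2$ array of mixed signals returns the selected rotation `A` together with the estimated components `S`. Because `A` is orthogonal, the columns of `S` remain white; the search only chooses which rotation makes them least dependent under the assumed estimate. The cost of each evaluation grows quickly with $n$, since the sketch forms full $n\times n$ Gram matrices.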

References

[1] Bach, Francis R.; Jordan, Michael I. (2003). "Kernel independent component analysis". The Journal of Machine Learning Research. 3: 1–48. doi:10.1162/153244303768966085.
[2] Bach, Francis R.; Jordan, Michael I. (2003). "Kernel independent component analysis". IEEE International Conference on Acoustics, Speech, and Signal Processing. doi:10.1109/icassp.2003.1202783.
[3] Saitoh, Saburou (1988). Theory of Reproducing Kernels and Its Applications. Longman. ISBN 0582035643.
