# Higher-order singular value decomposition


A higher-order singular value decomposition (HOSVD) is a generalization of the matrix singular value decomposition (SVD) to multi-way arrays. It may also be referred to as the M-mode SVD or the multilinear SVD. The M-mode SVD and Tucker decomposition terminology was coined in the 1980s by P. Kroonenberg; the decomposition was later referred to as the multilinear SVD/HOSVD by L. De Lathauwer. The term HOSVD is something of a misnomer, since in multilinear algebra there does not exist an SVD method for multi-way arrays (also known as M-way arrays, or informally data tensors) with all the properties of a matrix SVD. A matrix SVD simultaneously computes

(a) a rank-R decomposition and
(b) the orthonormal row/column matrices.

These two capabilities are embodied by two different decompositions for multi-way arrays.

Property (a) may be computed by employing the CANDECOMP/PARAFAC (CP) decomposition. CP is a linear decomposition that represents a tensor as a sum of R rank-1 tensors, for an a priori user-specified R. This should not be confused with the rank-R decomposition, in which R is the minimum number of rank-1 tensors whose sum reproduces the tensor exactly. A rank-1 tensor is the outer product of m vectors, where m is the number of tensor modes.
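
As a concrete sketch of a CP model, the following minimal numpy example (with made-up dimensions and random factor vectors) builds a tensor as a sum of R rank-1 terms, each the outer product of m = 3 vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3  # illustrative tensor dimensions and CP rank

# One set of R factor vectors per mode (columns are the vectors).
a = rng.standard_normal((I, R))
b = rng.standard_normal((J, R))
c = rng.standard_normal((K, R))

# CP model: T = sum over r of the outer product a_r (x) b_r (x) c_r.
T = np.einsum('ir,jr,kr->ijk', a, b, c)

# Each individual term is a rank-1 tensor: the outer product of m = 3 vectors.
term0 = np.einsum('i,j,k->ijk', a[:, 0], b[:, 0], c[:, 0])
assert T.shape == (I, J, K)
```

Note that the rank R here is chosen by the user; nothing in this construction certifies that R is the minimal (exact) rank of T.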

Property (b) is extended to higher-order tensors by a class of methods known variously as Tucker3, M-mode SVD, "Multilinear SVD" or "Higher Order SVD". (This article will use the term "Tucker decomposition".) These methods compute the orthonormal spaces associated with the different modes (axes) of a tensor. The Tucker decomposition is used in multilinear principal component analysis.

Historically, much of the interest in multilinear SVDs was driven by the need to analyze empirical data, especially in psychometrics and chemometrics. Abstract and general mathematical theorems are rare (though see Kruskal[1] with regard to the CP decomposition); instead, the methods are often designed for analyzing specific data types. The 2008 review article by Kolda and Bader[2] provides a compact summary of the history of these decompositions, and many references for further reading.

## Tucker decomposition / M-mode SVD / Multilinear SVD / Higher Order SVD

Main article: Tucker decomposition

The Tucker decomposition and the multilinear SVD are the same decomposition.

### History

In 1966, L. Tucker proposed a decomposition method for three-way arrays (referred to as 3-mode "tensors") as a multidimensional extension of factor analysis.[3] This decomposition was further developed in the 1980s by P. Kroonenberg, who coined the terms Tucker3, Tucker3ALS (an alternating least squares dimensionality reduction algorithm), 3-Mode SVD, and 3-Mode PCA.[4] In the intervening years, several authors developed the decomposition for N-way arrays. Most recently, this work was treated in an elegant fashion and introduced to the SIAM community by L. De Lathauwer et al., who referred to the decomposition as a multilinear SVD and HOSVD.[5]

### Definitions

Let the SVD of a real matrix ${\displaystyle A}$ be ${\displaystyle A=USV^{T}}$; it can then be written in element-wise form as

${\displaystyle a_{i_{1},i_{2}}=\sum _{j_{1}}\sum _{j_{2}}s_{j_{1},j_{2}}u_{i_{1},j_{1}}v_{i_{2},j_{2}}.}$

${\displaystyle U}$ and ${\displaystyle V}$ give orthonormal bases for the column and row spaces that are optimal in a certain sense, and ${\displaystyle S}$ is diagonal with non-increasing elements. The higher-order singular value decomposition (HOSVD) can be defined as the multidimensional generalization of this concept:

${\displaystyle a_{i_{1},i_{2},\dots ,i_{N}}=\sum _{j_{1}}\sum _{j_{2}}\cdots \sum _{j_{N}}s_{j_{1},j_{2},\dots ,j_{N}}u_{i_{1},j_{1}}^{(1)}u_{i_{2},j_{2}}^{(2)}\dots u_{i_{N},j_{N}}^{(N)},}$

where the matrices ${\displaystyle U^{(n)}=[u_{i,j}^{(n)}]_{I_{n}\times I_{n}}}$ and the core tensor ${\displaystyle {\mathcal {S}}=[s_{j_{1},\dots ,j_{N}}]_{I_{1}\times I_{2}\times \cdots \times I_{N}}}$ must satisfy certain requirements (analogous to those of the matrix SVD), namely

• Each ${\displaystyle U^{(n)}}$ is an orthogonal matrix.
• Two subtensors of the core tensor ${\displaystyle {\mathcal {S}}}$ are orthogonal i.e., ${\displaystyle \langle {\mathcal {S}}_{i_{n}=p},{\mathcal {S}}_{i_{n}=q}\rangle =0}$ if ${\displaystyle p\neq q}$.
• The subtensors in the core tensor ${\displaystyle {\mathcal {S}}}$ are ordered according to their Frobenius norm, i.e. ${\displaystyle \|{\mathcal {S}}_{i_{n}=1}\|\geq \|{\mathcal {S}}_{i_{n}=2}\|\geq \dots \geq \|{\mathcal {S}}_{i_{n}=I_{n}}\|}$ for n = 1, ..., N.

Notation:

${\displaystyle {\mathcal {A}}={\mathcal {S}}\times _{n=1}^{N}U^{(n)}}$
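
Here ${\displaystyle \times _{n}}$ denotes the mode-n product, which multiplies a matrix into the n-th index of a tensor. A minimal numpy sketch of a single mode-1 product (all names and shapes below are illustrative stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((3, 4, 5))   # stand-in core tensor
U1 = rng.standard_normal((6, 3))     # mode-1 factor, maps dimension 3 -> 6

# Mode-1 product (S x_1 U1): contract U1 against the first index of S.
A = np.einsum('pa,aij->pij', U1, S)

# Equivalent via flattening: unfold S along mode 1, multiply, refold.
A_alt = (U1 @ S.reshape(3, -1)).reshape(6, 4, 5)
assert np.allclose(A, A_alt)
```

The flattening-based form makes explicit that a mode-n product is an ordinary matrix product applied to the mode-n unfolding of the tensor.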

### Algorithm

The HOSVD can be built from several SVDs, as follows:[5][6]

1. Given a tensor ${\displaystyle {\mathcal {A}}\in \mathbb {R} ^{I_{1}\times I_{2}\times \cdots \times I_{N}}}$, construct the mode-k flattening ${\displaystyle {\mathcal {A}}_{(k)}}$, i.e., the ${\displaystyle I_{k}\times (\prod _{j\neq k}I_{j})}$ matrix whose columns are the mode-k fibers of ${\displaystyle {\mathcal {A}}}$.
2. Compute the singular value decomposition ${\displaystyle {\mathcal {A}}_{(k)}=U_{k}\Sigma _{k}V_{k}^{T}}$, and store the left singular vectors ${\displaystyle U_{k}}$.
3. The core tensor ${\displaystyle {\mathcal {S}}}$ is then the projection of ${\displaystyle {\mathcal {A}}}$ onto the tensor basis formed by the factor matrices ${\displaystyle \{U_{n}\}_{n=1}^{N}}$, i.e., ${\displaystyle {\mathcal {S}}={\mathcal {A}}\times _{n=1}^{N}U_{n}^{T}.}$
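
The three steps above can be sketched in numpy for a third-order tensor as follows (a minimal illustration on random data, not an optimized implementation; truncation is omitted):

```python
import numpy as np

def hosvd(A):
    """HOSVD of an N-way array: factor matrices U_k and core S with A = S x_1 U_1 ... x_N U_N."""
    U = []
    for k in range(A.ndim):
        # Step 1: mode-k flattening, an I_k x (product of the other dims) matrix.
        Ak = np.moveaxis(A, k, 0).reshape(A.shape[k], -1)
        # Step 2: left singular vectors of the flattening.
        Uk, _, _ = np.linalg.svd(Ak, full_matrices=False)
        U.append(Uk)
    # Step 3: core tensor S = A x_1 U_1^T x_2 U_2^T ... x_N U_N^T.
    S = A
    for k, Uk in enumerate(U):
        S = np.moveaxis(np.tensordot(Uk.T, np.moveaxis(S, k, 0), axes=1), 0, k)
    return S, U

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4, 5))
S, U = hosvd(A)

# Reconstruction: A = S x_1 U_1 x_2 U_2 x_3 U_3.
B = S
for k, Uk in enumerate(U):
    B = np.moveaxis(np.tensordot(Uk, np.moveaxis(B, k, 0), axes=1), 0, k)
assert np.allclose(A, B)
```

Because the singular values returned by each SVD are sorted in decreasing order, the resulting core slices along each mode are mutually orthogonal and ordered by Frobenius norm, matching the properties listed in the Definitions section.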

## Applications

The HOSVD is most commonly applied to the extraction of relevant information from multi-way arrays.

Circa 2001, Vasilescu reframed data analysis, recognition and synthesis problems as multilinear tensor problems, based on the insight that most observed data are the result of several causal factors of data formation and are thus well suited to multi-modal tensor analysis. The power of the tensor framework was showcased in a visually and mathematically compelling manner by decomposing and representing an image in terms of its causal factors of data formation, in the context of Human Motion Signatures[7] (CVPR 2001, ICPR 2002), face recognition (TensorFaces,[6][8] ECCV 2002, CVPR 2003, etc.) and computer graphics (TensorTextures, SIGGRAPH 2004).[9]

The Multilinear SVD has been successfully applied to signal processing and big data, e.g., in genomic signal processing.[10][11][12] These applications also inspired a higher-order GSVD (HO GSVD)[13] and a tensor GSVD.[14]

A combination of HOSVD and SVD also has been applied for real-time event detection from complex data streams (multivariate data with space and time dimensions) in Disease surveillance.[15]

It is also used in tensor product model transformation-based controller design.[16][17] In multilinear subspace learning,[18] it was applied to modeling tensor objects[19] for gait recognition.

In machine learning, the CP decomposition is the central ingredient in learning probabilistic latent variable models via the technique of moment matching. For example, consider the multi-view model,[20] a probabilistic latent variable model. In this model, the generation of samples is posited as follows: there exists a hidden random variable that is not observed directly; conditioned on it, there are several conditionally independent random variables known as the different "views" of the hidden variable. For simplicity, assume there are three symmetric views ${\displaystyle x}$ of a ${\displaystyle k}$-state categorical hidden variable ${\displaystyle h}$. Then the empirical third moment of this latent variable model can be written as ${\displaystyle T=\sum _{i=1}^{k}\Pr(h=i)\,E[x|h=i]^{\otimes 3}}$.
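
As an illustrative numerical sketch (the mixing weights and conditional means below are arbitrary random stand-ins, not fitted model parameters), the moment tensor can be assembled with a single einsum:

```python
import numpy as np

k, d = 3, 5                      # hidden states, observation dimension
rng = np.random.default_rng(3)

w = rng.dirichlet(np.ones(k))    # stand-in for Pr(h = i); a point on the simplex
M = rng.standard_normal((d, k))  # column i stands in for E[x | h = i]

# T = sum over i of Pr(h=i) * E[x|h=i] (x) E[x|h=i] (x) E[x|h=i]
T = np.einsum('i,ai,bi,ci->abc', w, M, M, M)
assert T.shape == (d, d, d)
```

Moment matching then amounts to recovering the weights w and the columns of M from T, which is exactly a (symmetric) CP decomposition of the third-moment tensor.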

In applications such as topic modeling, this can be interpreted as the co-occurrence of words in a document. The eigenvalues of this empirical moment tensor can then be interpreted as the probabilities of choosing specific topics, and each column of the factor matrix of conditional means ${\displaystyle E[x|h]}$ corresponds to the probabilities of words in the vocabulary under the corresponding topic.

The concept of the multilinear SVD was carried over to functions by Baranyi and Yam via the TP model transformation.[16][17] This extension led to the definition of the HOSVD-based canonical form of tensor product functions and linear parameter-varying system models,[21] and to convex hull manipulation based control optimization theory; see TP model transformation in control theories.

## References

1. ^ Kruskal, J. B. (1989). "Rank, decomposition, and uniqueness for 3-way and N-way arrays". In R. Coppi & S. Bolasco (Eds.), Multiway data analysis (pp. 7–18). Amsterdam: Elsevier.
2. ^ Kolda, Tamara G.; Bader, Brett W. "Tensor Decompositions and Applications". SIAM Review. 51 (3): 455–500. doi:10.1137/07070111X.
3. ^ Ledyard R. Tucker (September 1966). "Some mathematical notes on three-mode factor analysis". Psychometrika. 31 (3): 279–311. doi:10.1007/BF02289464.
4. ^ P. M. Kroonenberg (1983). "Three-mode principal component analysis: Theory and applications". DSWO Press, Leiden.
5. ^ a b Lieven De Lathauwer; Bart De Moor; Joos Vandewalle (April 2000). "A multilinear Singular Value Decomposition". SIAM Journal on Matrix Analysis. 21 (4): 1253–1278. doi:10.1137/s0895479896305696.
6. ^ a b M. A. O. Vasilescu; D. Terzopoulos (2002). "Multilinear Analysis of Image Ensembles: TensorFaces". Proc. European Conference on Computer Vision (ECCV 2002), Copenhagen, Denmark.
7. ^ M. A. O. Vasilescu (2002). "Human Motion Signatures: Analysis, Synthesis, Recognition". Proc. International Conference on Pattern Recognition (ICPR 2002), Quebec City, Canada.
8. ^ M. A. O. Vasilescu; D. Terzopoulos (2003). "Multilinear Subspace Analysis of Image Ensembles". Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2003).
9. ^ M. A. O. Vasilescu; D. Terzopoulos (2004). "TensorTextures: Multilinear Image-Based Rendering". ACM Transactions on Graphics (Proc. SIGGRAPH 2004).
10. ^ L. Omberg; G. H. Golub; O. Alter (November 2007). "A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data From Different Studies". PNAS. 104 (47): 18371–18376. doi:10.1073/pnas.0709146104. PMID 18003902.
11. ^ L. Omberg; J. R. Meyerson; K. Kobayashi; L. S. Drury; J. F. X. Diffley; O. Alter (October 2009). "Global Effects of DNA Replication and DNA Replication Origin Activity on Eukaryotic Gene Expression". Molecular Systems Biology. 5: 312. doi:10.1038/msb.2009.70. PMID 19888207.
12. ^ C. Muralidhara; A. M. Gross; R. R. Gutell; O. Alter (April 2011). "Tensor Decomposition Reveals Concurrent Evolutionary Convergences and Divergences and Correlations with Structural Motifs in Ribosomal RNA". PLoS ONE. 6 (4): e18768. doi:10.1371/journal.pone.0018768.
13. ^ S. P. Ponnapalli; M. A. Saunders; C. F. Van Loan; O. Alter (December 2011). "A Higher-Order Generalized Singular Value Decomposition for Comparison of Global mRNA Expression from Multiple Organisms". PLOS ONE. 6 (12): e28072. doi:10.1371/journal.pone.0028072.
14. ^ P. Sankaranarayanan; T. E. Schomay; K. A. Aiello; O. Alter (April 2015). "Tensor GSVD of Patient- and Platform-Matched Tumor and Normal DNA Copy-Number Profiles Uncovers Chromosome Arm-Wide Patterns of Tumor-Exclusive Platform-Consistent Alterations Encoding for Cell Transformation and Predicting Ovarian Cancer Survival". PLOS ONE. 10 (4): e0121396. doi:10.1371/journal.pone.0121396. PMID 25875127.
15. ^ Hadi Fanaee-T; João Gama (May 2015). "EigenEvent: An algorithm for event detection from complex data streams in Syndromic surveillance". Intelligent Data Analysis. 19 (3).
16. ^ a b P. Baranyi (April 2004). "TP model transformation as a way to LMI based controller design". IEEE Transactions on Industrial Electronics. 51 (2): 387–400. doi:10.1109/tie.2003.822037.
17. ^ a b P. Baranyi; D. Tikk; Y. Yam; R. J. Patton (2003). "From Differential Equations to PDC Controller Design via Numerical Transformation". Computers in Industry, Elsevier Science. 51: 281–297. doi:10.1016/s0166-3615(03)00058-7.
18. ^ Haiping Lu, K.N. Plataniotis and A.N. Venetsanopoulos, "A Survey of Multilinear Subspace Learning for Tensor Data", Pattern Recognition, Vol. 44, No. 7, pp. 1540–1551, Jul. 2011.
19. ^ H. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, "MPCA: Multilinear principal component analysis of tensor objects," IEEE Trans. Neural Netw., vol. 19, no. 1, pp. 18–39, Jan. 2008.
20. ^ Anandkumar, Animashree; Ge, Rong; Hsu, Daniel; Kakade, Sham M; Telgarsky, Matus (2014). "Tensor decompositions for learning latent variable models". The Journal of Machine Learning Research. 15 (1): 2773–2832.
21. ^ P. Baranyi; L. Szeidl; P. Várlaki; Y. Yam (July 3–5, 2006). Definition of the HOSVD-based canonical form of polytopic dynamic models. Budapest, Hungary. pp. 660–665.