# Higher-order singular value decomposition

In multilinear algebra, the higher-order singular value decomposition (HOSVD) of a tensor is a specific orthogonal Tucker decomposition. It may be regarded as one generalization of the matrix singular value decomposition. The HOSVD has applications in computer graphics, machine learning, scientific computing, and signal processing. Some key ingredients of the HOSVD can be traced as far back as F. L. Hitchcock in 1928,[1] but it was L. R. Tucker who developed for third-order tensors the general Tucker decomposition in the 1960s,[2][3][4] including the HOSVD. The HOSVD as decomposition in its own right was further advocated by L. De Lathauwer et al.[5] in 2000.

As the HOSVD was studied in many scientific fields, it is sometimes historically referred to as multilinear singular value decomposition, m-mode SVD, or cube SVD, and sometimes it is incorrectly identified with a Tucker decomposition.

## Definition

For the purpose of this article, the abstract tensor ${\displaystyle {\mathcal {A}}}$ is assumed to be given in coordinates with respect to some basis as a multidimensional array, also denoted by ${\displaystyle {\mathcal {A}}}$, in ${\displaystyle F^{n_{1}\times n_{2}\times \cdots \times n_{d}}}$, where d is the order of the tensor and ${\displaystyle F}$ is either ${\displaystyle \mathbb {R} }$ or ${\displaystyle \mathbb {C} }$.

Let ${\displaystyle U_{k}\in F^{n_{k}\times n_{k}}}$be a unitary matrix containing a basis of the left singular vectors of the standard factor-k flattening ${\displaystyle {\mathcal {A}}_{(k)}}$ of ${\displaystyle {\mathcal {A}}}$ such that the jth column ${\displaystyle \mathbf {u} _{j}^{k}}$ of ${\displaystyle U_{k}}$ corresponds to the jth largest singular value of ${\displaystyle {\mathcal {A}}_{(k)}}$. Observe that the factor matrix ${\displaystyle U_{k}}$ does not depend on the particular freedom of choice in the definition of the standard factor-k flattening. By the properties of the multilinear multiplication, we have

${\displaystyle {\begin{array}{rcl}{\mathcal {A}}&=&(I,I,\ldots ,I)\cdot {\mathcal {A}}\\&=&(U_{1}U_{1}^{H},U_{2}U_{2}^{H},\ldots ,U_{d}U_{d}^{H})\cdot {\mathcal {A}}\\&=&(U_{1},U_{2},\ldots ,U_{d})\cdot \left((U_{1}^{H},U_{2}^{H},\ldots ,U_{d}^{H})\cdot {\mathcal {A}}\right),\end{array}}}$
where ${\displaystyle \cdot ^{H}}$ denotes the conjugate transpose. The second equality is because the ${\displaystyle U_{k}}$'s are unitary matrices. Define now the core tensor
${\displaystyle {\mathcal {S}}:=(U_{1}^{H},U_{2}^{H},\ldots ,U_{d}^{H})\cdot {\mathcal {A}}.}$
Then, the HOSVD[5] of ${\displaystyle {\mathcal {A}}}$ is the decomposition
${\displaystyle {\mathcal {A}}=(U_{1},U_{2},\ldots ,U_{d})\cdot {\mathcal {S}}.}$
The above construction shows that every tensor has a HOSVD.

## Compact HOSVD

As in the case the compact singular value decomposition of a matrix, it is also possible to consider a compact HOSVD, which is very useful in applications.

Assume that ${\displaystyle U_{k}\in F^{n_{k}\times r_{k}}}$ is a matrix with unitary columns containing a basis of the left singular vectors corresponding to the nonzero singular values of the standard factor-k flattening ${\displaystyle {\mathcal {A}}_{(k)}}$ of ${\displaystyle {\mathcal {A}}}$. Let the columns of ${\displaystyle U_{k}}$ be sorted such that the jth column ${\displaystyle \mathbf {u} _{j}^{k}}$ of ${\displaystyle U_{k}}$ corresponds to the jth largest nonzero singular value of ${\displaystyle {\mathcal {A}}_{(k)}}$. Since the columns of ${\displaystyle U_{k}}$ form a basis for the image of ${\displaystyle {\mathcal {A}}_{(k)}}$, we have

${\displaystyle {\mathcal {A}}_{(k)}=U_{k}U_{k}^{H}{\mathcal {A}}_{(k)}={\bigl (}(U_{k}U_{k}^{H})\cdot _{k}{\mathcal {A}}{\bigr )}_{(k)},}$
where the first equality is due to the properties of orthogonal projections (in the Hermitian inner product) and the last equality is due to the properties of multilinear multiplication. As flattenings are bijective maps and the above formula is valid for all ${\displaystyle k=1,2,\ldots ,d}$, we find as before that
${\displaystyle {\begin{array}{rcl}{\mathcal {A}}&=&(U_{1}U_{1}^{H},U_{2}U_{2}^{H},\ldots ,U_{d}U_{d}^{H})\cdot {\mathcal {A}}\\&=&(U_{1},U_{2},\ldots ,U_{d})\cdot \left((U_{1}^{H},U_{2}^{H},\ldots ,U_{d}^{H})\cdot {\mathcal {A}}\right)\\&=&(U_{1},U_{2},\ldots ,U_{d})\cdot {\mathcal {S}},\end{array}}}$
where the core tensor ${\displaystyle {\mathcal {S}}}$ is now of size ${\displaystyle r_{1}\times r_{2}\times \cdots \times r_{d}}$.

## Multilinear rank

The tuple ${\displaystyle (r_{1},r_{2},\ldots ,r_{d})\in \mathbb {N} ^{d}}$ where ${\displaystyle r_{k}:=\mathrm {rank} ({\mathcal {A}}_{(k)})}$ is nowadays called the multilinear rank[1] of ${\displaystyle {\mathcal {A}}}$. By definition, every tensor has a unique multilinear rank, and its components are bounded by ${\displaystyle 0\leq r_{k}\leq n_{k}}$. Not all tuples in ${\displaystyle \mathbb {N} ^{d}}$ are multilinear ranks.[6] In particular, it is known that ${\textstyle r_{k}\leq \prod _{j\neq k}r_{j}}$ must hold.[6]

The compact HOSVD is a rank-revealing factorization in the sense that the dimensions of its core tensor correspond with the components of the multilinear rank of the tensor.

## Interpretation

The following geometric interpretation is valid for both the full and compact HOSVD. Let ${\displaystyle (r_{1},r_{2},\ldots ,r_{d})}$ be the multilinear rank of the tensor ${\displaystyle {\mathcal {A}}}$. Since ${\displaystyle {\mathcal {S}}\in F^{r_{1}\times r_{2}\times \cdots \times r_{d}}}$ is a multidimensional array, we can expand it as follows

${\displaystyle {\mathcal {S}}=\sum _{j_{1}=1}^{r_{1}}\sum _{j_{2}=1}^{r_{2}}\cdots \sum _{j_{d}=1}^{r_{d}}s_{j_{1},j_{2},\ldots ,j_{d}}\mathbf {e} _{j_{1}}^{1}\otimes \mathbf {e} _{j_{2}}^{2}\otimes \cdots \otimes \mathbf {e} _{j_{d}}^{d},}$
where ${\displaystyle \mathbf {e} _{j}^{k}}$ is the jth standard basis vector of ${\displaystyle F^{n_{k}}}$. By definition of the multilinear multiplication, it holds that
${\displaystyle {\mathcal {A}}=\sum _{j_{1}=1}^{r_{1}}\sum _{j_{2}=1}^{r_{2}}\cdots \sum _{j_{d}=1}^{r_{d}}s_{j_{1},j_{2},\ldots ,j_{d}}\mathbf {u} _{j_{1}}^{1}\otimes \mathbf {u} _{j_{2}}^{2}\otimes \cdots \otimes \mathbf {u} _{j_{d}}^{d},}$
where the ${\displaystyle \mathbf {u} _{j}^{k}}$ are the columns of ${\displaystyle U_{k}\in F^{n_{k}\times r_{k}}}$. It is easy to verify that ${\displaystyle B=\{\mathbf {u} _{j_{1}}^{1}\otimes \mathbf {u} _{j_{2}}^{2}\otimes \cdots \otimes \mathbf {u} _{j_{d}}^{d}\}_{j_{1},j_{2},\ldots ,j_{d}}}$ is an orthonormal set of tensors. This means that the HOSVD can be interpreted as a way to express the tensor ${\displaystyle {\mathcal {A}}}$ with respect to a specifically chosen orthonormal basis ${\displaystyle B}$ with the coefficients given as the multidimensional array ${\displaystyle {\mathcal {S}}}$.

## Computation

Let ${\displaystyle {\mathcal {A}}\in F^{n_{1}\times n_{2}\times \cdots \times n_{d}}}$, where ${\displaystyle F}$ is either ${\displaystyle \mathbb {R} }$ or ${\displaystyle \mathbb {C} }$, be a tensor of multilinear rank ${\displaystyle (r_{1},r_{2},\ldots ,r_{d})}$.

### Classic computation

The classic strategy for computing a compact HOSVD was introduced in 1966 by L. R. Tucker[2] and further advocated by L. De Lathauwer et al.[5]; it is based on the definition of the decomposition. The next three steps are to be performed:

• For ${\displaystyle k=1,2,\ldots ,d}$, do the following:
1. Construct the standard factor-k flattening ${\displaystyle {\mathcal {A}}_{(k)}}$;
2. Compute the (compact) singular value decomposition ${\displaystyle {\mathcal {A}}_{(k)}=U_{k}\Sigma _{k}V_{k}^{T}}$, and store the left singular vectors ${\displaystyle U_{k}\in F^{n_{k}\times r_{k}}}$;
3. Compute the core tensor ${\displaystyle {\mathcal {S}}}$ via the multilinear multiplication ${\displaystyle {\mathcal {S}}=(U_{1}^{H},U_{2}^{H},\ldots ,U_{d}^{H})\cdot {\mathcal {A}}.}$

### Interlaced computation

A strategy that is significantly faster when some or all ${\displaystyle r_{k}\ll n_{k}}$ consists of interlacing the computation of the core tensor and the factor matrices, as follows[7][8]:

• Set ${\displaystyle {\mathcal {A}}^{0}={\mathcal {A}}}$;
• For ${\displaystyle k=1,2,\ldots ,d}$ perform the following:
1. Construct the standard factor-k flattening ${\displaystyle {\mathcal {A}}_{(k)}^{k-1}}$;
2. Compute the (compact) singular value decomposition ${\displaystyle {\mathcal {A}}_{(k)}^{k-1}=U_{k}\Sigma _{k}V_{k}^{T}}$, and store the left singular vectors ${\displaystyle U_{k}\in F^{n_{k}\times r_{k}}}$;
3. Set ${\displaystyle {\mathcal {A}}^{k}=U_{k}^{H}\cdot _{k}{\mathcal {A}}^{k-1}}$, or, equivalently, ${\displaystyle {\mathcal {A}}_{(k)}^{k}=\Sigma _{k}V_{k}^{T}}$.

## Approximation

In applications, such as those mentioned below, a common problem consists of approximating a given tensor ${\displaystyle {\mathcal {A}}\in F^{n_{1}\times n_{2}\times \cdots \times n_{d}}}$ by one of low multilinear rank. Formally, if ${\displaystyle \mathrm {mlrank} ({\mathcal {A}})}$ denotes the multilinear rank of ${\displaystyle {\mathcal {A}}}$, then the nonlinear non-convex ${\displaystyle \ell _{2}}$-optimization problem is

${\displaystyle \min _{{\mathcal {B}}\in F^{n_{1}\times n_{2}\times \cdots \times n_{d}}}{\frac {1}{2}}\|{\mathcal {A}}-{\mathcal {B}}\|_{F}^{2}\quad {\text{s.t.}}\quad \mathrm {mlrank} ({\mathcal {B}})=(s_{1},s_{2},\ldots ,s_{d}),}$
where ${\displaystyle (s_{1},s_{2},\ldots ,s_{d})\in \mathbb {N} ^{d}}$ with ${\displaystyle 0\leq s_{k}\leq n_{k}}$, is a target multilinear rank that is assumed to be given, and where the norm is the Frobenius norm.

Based on the foregoing algorithms for computing a compact HOSVD, a natural idea for trying to solve this optimization problem is to truncate the (compact) SVD in step 2 of either the classic or the interlaced computation. A classically truncated HOSVD is obtained by replacing step 2 in the classic computation by

• Compute a rank-${\displaystyle s_{k}}$ truncated SVD ${\displaystyle {\mathcal {A}}_{(k)}\approx U_{k}\Sigma _{k}V_{k}^{T}}$, and store the top ${\displaystyle s_{k}}$ left singular vectors ${\displaystyle U_{k}\in F^{n_{k}\times s_{k}}}$;

while a sequentially truncated HOSVD (or successively truncated HOSVD) is obtained by replacing step 2 in the interlaced computation by

• Compute a rank-${\displaystyle s_{k}}$ truncated SVD ${\displaystyle {\mathcal {A}}_{(k)}^{k-1}\approx U_{k}\Sigma _{k}V_{k}^{T}}$, and store the top ${\displaystyle s_{k}}$ left singular vectors ${\displaystyle U_{k}\in F^{n_{k}\times s_{k}}}$;

Unfortunately, neither of these strategies results in an optimal solution of the best low multilinear rank optimization problem[5][7], contrary to the matrix case where the Eckart-Young theorem holds. However, both the classically and sequentially truncated HOSVD result in a quasi-optimal solution[7][8][9]: if ${\displaystyle {\mathcal {B}}_{t}}$ denotes the classically or sequentially truncated HOSVD and ${\displaystyle {\mathcal {B}}^{*}}$ denotes the optimal solution to the best low multilinear rank approximation problem, then

${\displaystyle \|{\mathcal {A}}-{\mathcal {B}}_{t}\|_{F}\leq {\sqrt {d}}\|{\mathcal {A}}-{\mathcal {B}}^{*}\|_{F};}$
in practice this means that if there exists an optimal solution with a small error, then a truncated HOSVD will for many intended purposes also yield a sufficiently good solution.[7]

## Applications

The HOSVD is most commonly applied to the extraction of relevant information from multi-way arrays.

Circa 2001, Vasilescu reframed the data analysis, recognition and synthesis problems as multilinear tensor problems based on the insight that most observed data are the result of several causal factors of data formation, and are well suited for multi-modal data tensor analysis. The power of the tensor framework was showcased in a visually and mathematically compelling manner by decomposing and representing an image in terms of its causal factors of data formation, in the context of Human Motion Signatures,[10] face recognition—TensorFaces[11][12] and computer graphics—TensorTextures.[13]

The HOSVD has been successfully applied to signal processing and big data, e.g., in genomic signal processing.[14][15][16] These applications also inspired a higher-order GSVD (HO GSVD)[17] and a tensor GSVD.[18]

A combination of HOSVD and SVD also has been applied for real-time event detection from complex data streams (multivariate data with space and time dimensions) in disease surveillance.[19]

It is also used in tensor product model transformation-based controller design.[20][21] In multilinear subspace learning,[22] it was applied to modeling tensor objects[23] for gait recognition.

The concept of HOSVD was carried over to functions by Baranyi and Yam via the TP model transformation.[20][21] This extension led to the definition of the HOSVD-based canonical form of tensor product functions and Linear Parameter Varying system models[24] and to convex hull manipulation based control optimization theory, see TP model transformation in control theories.

HOSVD was proposed to be applied to multi-view data analysis[25] and was successfully applied to in silico drug discovery from gene expression.[26]

## References

1. ^ a b Hitchcock, Frank L (1928-04-01). "Multiple Invariants and Generalized Rank of a P-Way Matrix or Tensor". Journal of Mathematics and Physics. 7 (1–4): 39–79. doi:10.1002/sapm19287139. ISSN 1467-9590.
2. ^ a b Tucker, Ledyard R. (1966-09-01). "Some mathematical notes on three-mode factor analysis". Psychometrika. 31 (3): 279–311. doi:10.1007/bf02289464. ISSN 0033-3123.
3. ^ Tucker, L. R. (1963). "Implications of factor analysis of three-way matrices for measurement of change". In C. W. Harris (Ed.), Problems in measuring change. Madison, Wis.: Univ. Wis. Press.: 122–137.
4. ^ Tucker, L. R. (1964). "The extension of factor analysis to three-dimensional matrices". In N. Frederiksen and H. Gulliksen (Eds.), Contributions to mathematical psychology. New York: Holt, Rinehart and Winston: 109–127.
5. ^ a b c d De Lathauwer, L.; De Moor, B.; Vandewalle, J. (2000-01-01). "A Multilinear Singular Value Decomposition". SIAM Journal on Matrix Analysis and Applications. 21 (4): 1253–1278. doi:10.1137/s0895479896305696. ISSN 0895-4798.
6. ^ a b Carlini, Enrico; Kleppe, Johannes. "Ranks derived from multilinear maps". Journal of Pure and Applied Algebra. 215 (8): 1999–2004. doi:10.1016/j.jpaa.2010.11.010.
7. ^ a b c d Vannieuwenhoven, N.; Vandebril, R.; Meerbergen, K. (2012-01-01). "A New Truncation Strategy for the Higher-Order Singular Value Decomposition". SIAM Journal on Scientific Computing. 34 (2): A1027–A1052. doi:10.1137/110836067. ISSN 1064-8275.
8. ^ a b Hackbusch, Wolfgang. Tensor Spaces and Numerical Tensor Calculus | SpringerLink. doi:10.1007/978-3-642-28027-6.
9. ^ Grasedyck, L. (2010-01-01). "Hierarchical Singular Value Decomposition of Tensors". SIAM Journal on Matrix Analysis and Applications. 31 (4): 2029–2054. CiteSeerX . doi:10.1137/090764189. ISSN 0895-4798.
10. ^
11. ^
12. ^
13. ^
14. ^ L. Omberg; G. H. Golub; O. Alter (November 2007). "A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data From Different Studies". PNAS. 104 (47): 18371–18376. Bibcode:2007PNAS..10418371O. doi:10.1073/pnas.0709146104. PMC . PMID 18003902.
15. ^ L. Omberg; J. R. Meyerson; K. Kobayashi; L. S. Drury; J. F. X. Diffley; O. Alter (October 2009). "Global Effects of DNA Replication and DNA Replication Origin Activity on Eukaryotic Gene Expression". Molecular Systems Biology. 5: 312. doi:10.1038/msb.2009.70. PMC . PMID 19888207. Highlight.
16. ^ C. Muralidhara; A. M. Gross; R. R. Gutell; O. Alter (April 2011). "Tensor Decomposition Reveals Concurrent Evolutionary Convergences and Divergences and Correlations with Structural Motifs in Ribosomal RNA". PLoS ONE. 6 (4): e18768. Bibcode:2011PLoSO...618768M. doi:10.1371/journal.pone.0018768. PMC . PMID 21625625. Highlight.
17. ^ S. P. Ponnapalli; M. A. Saunders; C. F. Van Loan; O. Alter (December 2011). "A Higher-Order Generalized Singular Value Decomposition for Comparison of Global mRNA Expression from Multiple Organisms". PLOS ONE. 6 (12): e28072. Bibcode:2011PLoSO...628072P. doi:10.1371/journal.pone.0028072. PMC . PMID 22216090. Highlight.
18. ^ P. Sankaranarayanan; T. E. Schomay; K. A. Aiello; O. Alter (April 2015). "Tensor GSVD of Patient- and Platform-Matched Tumor and Normal DNA Copy-Number Profiles Uncovers Chromosome Arm-Wide Patterns of Tumor-Exclusive Platform-Consistent Alterations Encoding for Cell Transformation and Predicting Ovarian Cancer Survival". PLOS ONE. 10 (4): e0121396. Bibcode:2015PLoSO..1021396S. doi:10.1371/journal.pone.0121396. PMC . PMID 25875127. AAAS EurekAlert! Press Release and NAE Podcast Feature.
19. ^ Hadi Fanaee-T; João Gama (May 2015). "EigenEvent: An algorithm for event detection from complex data streams in Syndromic surveillance". Intelligent Data Analysis. 19 (3). arXiv:. Bibcode:2014arXiv1406.3496F.
20. ^ a b P. Baranyi (April 2004). "TP model transformation as a way to LMI based controller design". IEEE Transaction on Industrial Electronics. 51 (2): 387–400. doi:10.1109/tie.2003.822037.
21. ^ a b P. Baranyi; D. Tikk; Y. Yam; R. J. Patton (2003). "From Differential Equations to PDC Controller Design via Numerical Transformation". Computers in Industry, Elsevier Science. 51: 281–297. doi:10.1016/s0166-3615(03)00058-7.
22. ^ Haiping Lu, K.N. Plataniotis and A.N. Venetsanopoulos, "A Survey of Multilinear Subspace Learning for Tensor Data", Pattern Recognition, Vol. 44, No. 7, pp. 1540–1551, Jul. 2011.
23. ^ H. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, "MPCA: Multilinear principal component analysis of tensor objects," IEEE Trans. Neural Netw., vol. 19, no. 1, pp. 18–39, Jan. 2008.
24. ^ P. Baranyi; L. Szeidl; P. Várlaki; Y. Yam (July 3–5, 2006). Definition of the HOSVD-based canonical form of polytopic dynamic models. Budapest, Hungary. pp. 660–665.
25. ^ Y-h. Taguchi (August 2017). "Tensor decomposition-based unsupervised feature extraction applied to matrix products for multi-view data processing". PLoS ONE. 12 (8): e0183933. Bibcode:2017PLoSO..1283933T. doi:10.1371/journal.pone.0183933. PMC . PMID 28841719.
26. ^ Y-h. Taguchi (October 2017). "Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets". Scientific Reports. 7: 13733. Bibcode:2017NatSR...713733T. doi:10.1038/s41598-017-13003-0. PMC . PMID 29062063.