Tensor reshaping

In multilinear algebra, a reshaping of tensors is any bijection between the set of indices of an order-${\displaystyle d}$ tensor and the set of indices of an order-${\displaystyle \ell }$ tensor, where ${\displaystyle \ell . The use of indices presupposes tensors in coordinate representation with respect to a basis. The coordinate representation of a tensor can be regarded as a multi-dimensional array, and a bijection from one set of indices to another therefore amounts to a rearrangement of the array elements into an array of a different shape. Such a rearrangement constitutes a particular kind of linear map between the vector space of order-${\displaystyle d}$ tensors and the vector space of order-${\displaystyle \ell }$ tensors.

Definition

Given a positive integer ${\displaystyle d}$, the notation ${\displaystyle [d]}$ refers to the set ${\displaystyle \{1,\dots ,d\}}$ of the first d positive integers.

For each integer ${\displaystyle k}$ where ${\displaystyle 1\leq k\leq d}$ for a positive integer ${\displaystyle d}$, let Vk denote an nk-dimensional vector space over a field ${\displaystyle F}$. Then there are vector space isomorphisms (linear maps)

{\displaystyle {\begin{aligned}V_{1}\otimes \cdots \otimes V_{d}&\simeq F^{n_{1}}\otimes \cdots \otimes F^{n_{d}}\\&\simeq F^{n_{\pi _{1}}}\otimes \cdots \otimes F^{n_{\pi _{d}}}\\&\simeq F^{n_{\pi _{1}}n_{\pi _{2}}}\otimes F^{n_{\pi _{3}}}\otimes \cdots \otimes F^{n_{\pi _{d}}}\\&\simeq F^{n_{\pi _{1}}n_{\pi _{3}}}\otimes F^{n_{\pi _{2}}}\otimes F^{n_{\pi _{4}}}\otimes \cdots \otimes F^{n_{\pi _{d}}}\\&\,\,\,\vdots \\&\simeq F^{n_{1}n_{2}\ldots n_{d}},\end{aligned}}}

where ${\displaystyle \pi \in {\mathfrak {S}}_{d}}$ is any permutation and ${\displaystyle {\mathfrak {S}}_{d}}$ is the symmetric group on ${\displaystyle d}$ elements. Via these (and other) vector space isomorphisms, a tensor can be interpreted in several ways as an order-${\displaystyle \ell }$ tensor where ${\displaystyle \ell \leq d}$.

Coordinate representation

The first vector space isomorphism on the list above, ${\displaystyle V_{1}\otimes \cdots \otimes V_{d}\simeq F^{n_{1}}\otimes \cdots \otimes F^{n_{d}}}$, gives the coordinate representation of an abstract tensor. Assume that each of the ${\displaystyle d}$ vector spaces ${\displaystyle V_{k}}$ has a basis ${\displaystyle \{v_{1}^{k},v_{2}^{k},\ldots ,v_{n_{k}}^{k}\}}$. The expression of a tensor with respect to this basis has the form

${\displaystyle {\mathcal {A}}=\sum _{j_{1}=1}^{n_{1}}\ldots \sum _{j_{d}=1}^{n_{d}}a_{j_{1},j_{2},\ldots ,j_{d}}v_{j_{1}}^{1}\otimes v_{j_{2}}^{2}\otimes \cdots \otimes v_{j_{d}}^{d},}$
where the coefficients ${\displaystyle a_{j_{1},j_{2},\ldots ,j_{d}}}$ are elements of ${\displaystyle F}$. The coordinate representation of ${\displaystyle {\mathcal {A}}}$ is
${\displaystyle \sum _{j_{1}=1}^{n_{1}}\ldots \sum _{j_{d}=1}^{n_{d}}a_{j_{1},j_{2},\ldots ,j_{d}}\mathbf {e} _{j_{1}}^{1}\otimes \mathbf {e} _{j_{2}}^{2}\otimes \cdots \otimes \mathbf {e} _{j_{d}}^{d},}$
where ${\displaystyle \mathbf {e} _{j}^{k}}$ is the ${\displaystyle j^{\text{th}}}$ standard basis vector of ${\displaystyle F^{n_{k}}}$. This can be regarded as a d-dimensional array whose elements are the coefficients ${\displaystyle a_{j_{1},j_{2},\ldots ,j_{d}}}$.

Vectorization

By means of a bijective map ${\displaystyle \mu :[n_{1}]\times \cdots \times [n_{d}]\to [n_{1}\cdots n_{d}]}$, a vector space isomorphism between ${\displaystyle F^{n_{1}}\otimes \cdots \otimes F^{n_{d}}}$ and ${\displaystyle F^{n_{1}\cdots n_{d}}}$ is constructed via the mapping ${\displaystyle \mathbf {e} _{j_{1}}^{1}\otimes \cdots \otimes \mathbf {e} _{j_{d}}^{d}\mapsto \mathbf {e} _{\mu (j_{1},j_{2},\ldots ,j_{d})},}$ where for every natural number ${\displaystyle j}$ such that ${\displaystyle 1\leq j\leq n_{1}\cdots n_{d}}$, the vector ${\displaystyle \mathbf {e} _{j}}$ denotes the jth standard basis vector of ${\displaystyle F^{n_{1}\cdots n_{d}}}$. In such a reshaping, the tensor is simply interpreted as a vector in ${\displaystyle F^{n_{1}\cdots n_{d}}}$. This is known as vectorization, and is analogous to vectorization of matrices. A standard choice of bijection ${\displaystyle \mu }$ is such that

${\displaystyle \operatorname {vec} ({\mathcal {A}}):={\begin{bmatrix}a_{1,1,\ldots ,1}&a_{2,1,\ldots ,1}&\cdots &a_{n_{1},1,\ldots ,1}&a_{1,2,1,\ldots ,1}&\cdots &a_{n_{1},n_{2},\ldots ,n_{d}}\end{bmatrix}}^{T},}$

which is consistent with the way in which the colon operator in Matlab and GNU Octave reshapes a higher-order tensor into a vector. In general, the vectorization of ${\displaystyle {\mathcal {A}}}$ is the vector ${\displaystyle [a_{\mu ^{-1}(j)}]_{j=1}^{n_{1}\cdots n_{d}}}$.

General flattenings

For any permutation ${\displaystyle \pi \in {\mathfrak {S}}_{d}}$ there is a canonical isomorphism between the two tensor products of vector spaces ${\displaystyle V_{1}\otimes V_{2}\otimes \cdots \otimes V_{d}}$ and ${\displaystyle V_{\pi (1)}\otimes V_{\pi (2)}\otimes \cdots \otimes V_{\pi (d)}}$. Parentheses are usually omitted from such products due to the natural isomorphism between ${\displaystyle V_{i}\otimes (V_{j}\otimes V_{k})}$ and ${\displaystyle (V_{i}\otimes V_{j})\otimes V_{k}}$, but may, of course, be reintroduced to emphasize a particular grouping of factors. In the grouping,

${\displaystyle (V_{\pi (1)}\otimes \cdots \otimes V_{\pi (r_{1})})\otimes (V_{\pi (r_{1}+1)}\otimes \cdots \otimes V_{\pi (r_{2})})\otimes \cdots \otimes (V_{\pi (r_{\ell -1}+1)}\otimes \cdots \otimes V_{\pi (r_{\ell })}),}$

there are ${\displaystyle \ell }$ groups with ${\displaystyle r_{j}-r_{j-1}}$ factors in the ${\displaystyle j^{\text{th}}}$ group (where ${\displaystyle r_{0}=0}$ and ${\displaystyle r_{\ell }=d}$).

Letting ${\displaystyle S_{j}=(\pi (r_{j-1}+1),\pi (r_{j-1}+2),\ldots ,\pi (r_{j}))}$ for each ${\displaystyle j}$ satisfying ${\displaystyle 1\leq j\leq \ell }$, an ${\displaystyle (S_{1},S_{2},\ldots ,S_{\ell })}$-flattening of a tensor ${\displaystyle {\mathcal {A}}}$, denoted ${\displaystyle {\mathcal {A}}_{(S_{1},S_{2},\ldots ,S_{\ell })}}$, is obtained by applying the two processes above within each of the ${\displaystyle \ell }$ groups of factors. That is, the coordinate representation of the ${\displaystyle j^{\text{th}}}$ group of factors is obtained using the isomorphism ${\displaystyle (V_{\pi (r_{j-1}+1)}\otimes V_{\pi (r_{j-1}+2)}\otimes \cdots \otimes V_{\pi (r_{j})})\simeq (F^{n_{\pi (r_{j-1}+1)}}\otimes F^{n_{\pi (r_{j-1}+2)}}\otimes \cdots \otimes F^{n_{\pi (r_{j})}})}$, which requires specifying bases for all of the vector spaces ${\displaystyle V_{k}}$. The result is then vectorized using a bijection ${\displaystyle \mu _{j}:[n_{\pi (r_{j-1}+1)}]\times [n_{\pi (r_{j-1}+2)}]\times \cdots \times [n_{\pi (r_{j})}]\to [N_{S_{j}}]}$ to obtain an element of ${\displaystyle F^{N_{S_{j}}}}$, where ${\displaystyle N_{S_{j}}:=\prod _{i=r_{j-1}+1}^{r_{j}}n_{\pi (i)}}$, the product of the dimensions of the vector spaces in the ${\displaystyle j^{\text{th}}}$ group of factors. The result of applying these isomorphisms within each group of factors is an element of ${\displaystyle F^{N_{S_{1}}}\otimes \cdots \otimes F^{N_{S_{\ell }}}}$, which is a tensor of order ${\displaystyle \ell }$.

The vectorization of ${\displaystyle {\mathcal {A}}}$ is an ${\displaystyle (S_{1})}$-reshaping, ${\displaystyle {\mathcal {A}}_{(S_{1})}}$ wherein ${\displaystyle S_{1}=(1,2,\ldots ,d)}$.

Matricization

Let ${\displaystyle {\mathcal {A}}\in F^{n_{1}}\otimes F^{n_{2}}\otimes \cdots \otimes F^{n_{d}}}$ be the coordinate representation of an abstract tensor with respect to a basis. A standard factor-k flattening of ${\displaystyle {\mathcal {A}}}$ is an ${\displaystyle (S_{1},S_{2})}$-reshaping in which ${\displaystyle S_{1}=(k)}$ and ${\displaystyle S_{2}=(1,2,\ldots ,k-1,k+1,\ldots ,d)}$. Usually, a standard flattening is denoted by

${\displaystyle {\mathcal {A}}_{(k)}:={\mathcal {A}}_{(S_{1},S_{2})}}$

These reshapings are sometimes called matricizations or unfoldings in the literature. A standard choice for the bijections ${\displaystyle \mu _{1},\ \mu _{2}}$ is the one that is consistent with the reshape function in Matlab and GNU Octave, namely

${\displaystyle {\mathcal {A}}_{(k)}:={\begin{bmatrix}a_{1,1,\ldots ,1,1,1,\ldots ,1}&a_{2,1,\ldots ,1,1,1,\ldots ,1}&\cdots &a_{n_{1},n_{2},\ldots ,n_{k-1},1,n_{k+1},\ldots ,n_{d}}\\a_{1,1,\ldots ,1,2,1,\ldots ,1}&a_{2,1,\ldots ,1,2,1,\ldots ,1}&\cdots &a_{n_{1},n_{2},\ldots ,n_{k-1},2,n_{k+1},\ldots ,n_{d}}\\\vdots &\vdots &&\vdots \\a_{1,1,\ldots ,1,n_{k},1,\ldots ,1}&a_{2,1,\ldots ,1,n_{k},1,\ldots ,1}&\cdots &a_{n_{1},n_{2},\ldots ,n_{k-1},n_{k},n_{k+1},\ldots ,n_{d}}\end{bmatrix}}}$