Permutation matrix

In mathematics, particularly in matrix theory, a permutation matrix is a square binary matrix that has exactly one entry of 1 in each row and each column and 0s elsewhere. Each such matrix, say $P$, represents a permutation of $m$ elements and, when used to multiply another matrix, say $A$, results in permuting the rows (when pre-multiplying, to form $PA$) or columns (when post-multiplying, to form $AP$) of the matrix $A$.

Definition

Given a permutation $\pi$ of $m$ elements,

\pi :\lbrace 1,\ldots ,m\rbrace \to \lbrace 1,\ldots ,m\rbrace

represented in two-line form by

{\begin{pmatrix}1&2&\cdots &m\\\pi (1)&\pi (2)&\cdots &\pi (m)\end{pmatrix}},

there are two natural ways to associate the permutation with a permutation matrix; namely, starting with the $m\times m$ identity matrix, $I_{m}$, either permute the columns or permute the rows, according to $\pi$. Both methods of defining permutation matrices appear in the literature and the properties expressed in one representation can be easily converted to the other representation. This article will primarily deal with just one of these representations and the other will only be mentioned when there is a difference to be aware of.

The $m\times m$ permutation matrix $P_{\pi }=(p_{ij})$ obtained by permuting the columns of the identity matrix $I_{m}$, that is, for each $i$, $p_{ij}=1$ if $j=\pi (i)$ and $p_{ij}=0$ otherwise, will be referred to as the column representation in this article.^[1] Since the entries in row $i$ are all 0 except that a 1 appears in column $\pi (i)$, we may write

P_{\pi }={\begin{bmatrix}\mathbf {e} _{\pi (1)}\\\mathbf {e} _{\pi (2)}\\\vdots \\\mathbf {e} _{\pi (m)}\end{bmatrix}},

where $\mathbf {e} _{j}$, a standard basis vector, denotes a row vector of length $m$ with 1 in the $j$th position and 0 in every other position.^[2]

For example, the permutation matrix $P_{\pi }$ corresponding to the permutation $\pi ={\begin{pmatrix}1&2&3&4&5\\1&4&2&5&3\end{pmatrix}}$ is

P_{\pi }={\begin{bmatrix}\mathbf {e} _{\pi (1)}\\\mathbf {e} _{\pi (2)}\\\mathbf {e} _{\pi (3)}\\\mathbf {e} _{\pi (4)}\\\mathbf {e} _{\pi (5)}\end{bmatrix}}={\begin{bmatrix}\mathbf {e} _{1}\\\mathbf {e} _{4}\\\mathbf {e} _{2}\\\mathbf {e} _{5}\\\mathbf {e} _{3}\end{bmatrix}}={\begin{bmatrix}1&0&0&0&0\\0&0&0&1&0\\0&1&0&0&0\\0&0&0&0&1\\0&0&1&0&0\end{bmatrix}}.

Observe that the $j$th column of the identity matrix $I_{5}$ now appears as the $\pi (j)$th column of $P_{\pi }$.

The other representation, obtained by permuting the rows of the identity matrix $I_{m}$, that is, for each $j$, $p_{ij}=1$ if $i=\pi (j)$ and $p_{ij}=0$ otherwise, will be referred to as the row representation.
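The two definitions can be compared side by side in a short Python sketch. It is only an illustration: indices are 0-based (so the example permutation 1, 4, 2, 5, 3 becomes 0, 3, 1, 4, 2), matrices are plain nested lists, and the helper names `column_representation`, `row_representation` and `transpose` are assumptions made here, not part of any library.

```python
def column_representation(perm):
    """Row i has its single 1 in column perm[i] (0-based indices)."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def row_representation(perm):
    """Column j has its single 1 in row perm[j] (0-based indices)."""
    m = len(perm)
    return [[1 if i == perm[j] else 0 for j in range(m)] for i in range(m)]


def transpose(a):
    """Return the transpose of a nested-list matrix."""
    return [list(col) for col in zip(*a)]


# The article's example permutation, shifted to 0-based indices.
pi = [0, 3, 1, 4, 2]
P = column_representation(pi)
Q = row_representation(pi)

for row in P:
    print(row)
```

In this sketch the two matrices turn out to be transposes of each other, which can be confirmed with `Q == transpose(P)`.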

Properties

The column representation of a permutation matrix is used throughout this section, except when otherwise indicated.

Multiplying a column vector $\mathbf {g}$ on the left by $P_{\pi }$ permutes the entries of the vector: $P_{\pi }\mathbf {g} ={\begin{bmatrix}\mathbf {e} _{\pi (1)}\\\mathbf {e} _{\pi (2)}\\\vdots \\\mathbf {e} _{\pi (n)}\end{bmatrix}}{\begin{bmatrix}g_{1}\\g_{2}\\\vdots \\g_{n}\end{bmatrix}}={\begin{bmatrix}g_{\pi (1)}\\g_{\pi (2)}\\\vdots \\g_{\pi (n)}\end{bmatrix}}.$

Repeated use of this result shows that if $M$ is an appropriately sized matrix, the product $P_{\pi }M$ is just a permutation of the rows of $M$. However, observing that $P_{\pi }\mathbf {e} _{k}^{\mathsf {T}}=\mathbf {e} _{\pi ^{-1}(k)}^{\mathsf {T}}$ for each $k$ shows that the permutation of the rows is given by $\pi ^{-1}$. (Here $M^{\mathsf {T}}$ denotes the transpose of a matrix $M$.)
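Both statements can be checked numerically. The following is a minimal sketch under stated assumptions: 0-based indices, nested-list matrices, and illustrative helper names (`perm_matrix`, `matmul`, `inverse_perm`) that do not come from any library.

```python
def perm_matrix(perm):
    """Column representation: row i has a 1 in column perm[i]."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_perm(perm):
    """Return the inverse permutation as a list."""
    inv = [0] * len(perm)
    for i, image in enumerate(perm):
        inv[image] = i
    return inv


pi = [0, 3, 1, 4, 2]
P = perm_matrix(pi)

# A 5 x 3 test matrix whose entries encode their own position.
M = [[10 * i + j for j in range(3)] for i in range(5)]
PM = matmul(P, M)

# Row i of PM is row pi[i] of M.
assert PM == [M[pi[i]] for i in range(5)]

# P applied to the column vector e_k gives e_{pi^{-1}(k)}.
inv = inverse_perm(pi)
for k in range(5):
    e_k = [[1 if i == k else 0] for i in range(5)]
    expected = [[1 if i == inv[k] else 0] for i in range(5)]
    assert matmul(P, e_k) == expected
```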

As permutation matrices are orthogonal matrices (that is, $P_{\pi }P_{\pi }^{\mathsf {T}}=I$ ), the inverse matrix exists and can be written as $P_{\pi }^{-1}=P_{\pi ^{-1}}=P_{\pi }^{\mathsf {T}}.$
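The chain of equalities can be verified on the running example. This is an illustrative sketch only, with 0-based indices and hypothetical helper names.

```python
def perm_matrix(perm):
    """Column representation: row i has a 1 in column perm[i]."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a nested-list matrix."""
    return [list(col) for col in zip(*a)]


def inverse_perm(perm):
    """Return the inverse permutation as a list."""
    inv = [0] * len(perm)
    for i, image in enumerate(perm):
        inv[image] = i
    return inv


pi = [0, 3, 1, 4, 2]
P = perm_matrix(pi)
identity = perm_matrix(list(range(len(pi))))

# Orthogonality: P times its transpose is the identity.
assert matmul(P, transpose(P)) == identity

# The transpose is the matrix of the inverse permutation.
P_inv = perm_matrix(inverse_perm(pi))
assert P_inv == transpose(P)
assert matmul(P, P_inv) == identity
```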

Multiplying a row vector $\mathbf {h}$ on the right by $P_{\pi }$ permutes the entries of the vector: $\mathbf {h} P_{\pi }={\begin{bmatrix}h_{1}&h_{2}&\cdots &h_{n}\end{bmatrix}}{\begin{bmatrix}\mathbf {e} _{\pi (1)}\\\mathbf {e} _{\pi (2)}\\\vdots \\\mathbf {e} _{\pi (n)}\end{bmatrix}}={\begin{bmatrix}h_{\pi ^{-1}(1)}&h_{\pi ^{-1}(2)}&\cdots &h_{\pi ^{-1}(n)}\end{bmatrix}}.$

Again, repeated application of this result shows that post-multiplying a matrix $M$ by the permutation matrix $P_{\pi }$, that is, forming $MP_{\pi }$, results in permuting the columns of $M$. Notice also that $\mathbf {e} _{k}P_{\pi }=\mathbf {e} _{\pi (k)}.$
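The action on row vectors can be sketched in the same way. The code below is an illustration under the usual assumptions of this kind of example (0-based indices, plain lists, hypothetical helper names such as `vecmat`).

```python
def perm_matrix(perm):
    """Column representation: row i has a 1 in column perm[i]."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def vecmat(h, a):
    """Row vector times matrix, both as plain lists."""
    return [sum(x * y for x, y in zip(h, col)) for col in zip(*a)]


def inverse_perm(perm):
    """Return the inverse permutation as a list."""
    inv = [0] * len(perm)
    for i, image in enumerate(perm):
        inv[image] = i
    return inv


pi = [0, 3, 1, 4, 2]
P = perm_matrix(pi)
inv = inverse_perm(pi)

h = [10, 11, 12, 13, 14]
hP = vecmat(h, P)

# Entry j of hP is entry pi^{-1}(j) of h.
assert hP == [h[inv[j]] for j in range(5)]

# The row vector e_k is sent to e_{pi(k)}.
for k in range(5):
    e_k = [1 if i == k else 0 for i in range(5)]
    assert vecmat(e_k, P) == [1 if i == pi[k] else 0 for i in range(5)]
```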

Given two permutations $\pi$ and $\sigma$ of $m$ elements, the corresponding permutation matrices $P_{\pi }$ and $P_{\sigma }$ acting on column vectors are composed with $P_{\sigma }P_{\pi }\,\mathbf {g} =P_{\pi \,\circ \,\sigma }\,\mathbf {g} .$ The same matrices acting on row vectors (that is, post-multiplication) compose according to the same rule $\mathbf {h} P_{\sigma }P_{\pi }=\mathbf {h} P_{\pi \,\circ \,\sigma }.$ To be clear, the above formulas use the prefix notation for permutation composition, that is, $(\pi \,\circ \,\sigma )(k)=\pi \left(\sigma (k)\right).$
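Because the order of composition is easy to get wrong, a small check is useful. The sketch below assumes 0-based indices and uses a second permutation `sigma` chosen purely for illustration; the helper names are hypothetical.

```python
def perm_matrix(perm):
    """Column representation: row i has a 1 in column perm[i]."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def compose(f, g):
    """Prefix composition: (f o g)(k) = f(g(k))."""
    return [f[g[k]] for k in range(len(g))]


pi = [0, 3, 1, 4, 2]
sigma = [1, 2, 0, 4, 3]  # illustrative choice that does not commute with pi

product = matmul(perm_matrix(sigma), perm_matrix(pi))

# In the column representation the product corresponds to pi o sigma ...
assert product == perm_matrix(compose(pi, sigma))
# ... and, for this pair, not to sigma o pi.
assert product != perm_matrix(compose(sigma, pi))
```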

Let $Q_{\pi }$ be the permutation matrix corresponding to $π$ in its row representation. The properties of this representation can be determined from those of the column representation since $Q_{\pi }=P_{\pi }^{\mathsf {T}}=P_{{\pi }^{-1}}.$ In particular, $Q_{\pi }\mathbf {e} _{k}^{\mathsf {T}}=P_{{\pi }^{-1}}\mathbf {e} _{k}^{\mathsf {T}}=\mathbf {e} _{(\pi ^{-1})^{-1}(k)}^{\mathsf {T}}=\mathbf {e} _{\pi (k)}^{\mathsf {T}}.$ From this it follows that $Q_{\sigma }Q_{\pi }\,\mathbf {g} =Q_{\sigma \,\circ \,\pi }\,\mathbf {g} .$ Similarly, $\mathbf {h} \,Q_{\sigma }Q_{\pi }=\mathbf {h} \,Q_{\sigma \,\circ \,\pi }.$
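The reversed composition order of the row representation can be checked in the same style. This is a sketch with 0-based indices, an arbitrarily chosen `sigma`, and hypothetical helper names.

```python
def row_matrix(perm):
    """Row representation: column j has a 1 in row perm[j]."""
    m = len(perm)
    return [[1 if i == perm[j] else 0 for j in range(m)] for i in range(m)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def compose(f, g):
    """Prefix composition: (f o g)(k) = f(g(k))."""
    return [f[g[k]] for k in range(len(g))]


pi = [0, 3, 1, 4, 2]
sigma = [1, 2, 0, 4, 3]  # illustrative choice

Q_pi = row_matrix(pi)
Q_sigma = row_matrix(sigma)

# In the row representation the product corresponds to sigma o pi.
assert matmul(Q_sigma, Q_pi) == row_matrix(compose(sigma, pi))

# Q_pi sends the column vector e_k to e_{pi(k)}.
for k in range(5):
    e_k = [[1 if i == k else 0] for i in range(5)]
    assert matmul(Q_pi, e_k) == [[1 if i == pi[k] else 0] for i in range(5)]
```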

Permutation matrices can be characterized as the orthogonal matrices whose entries are all non-negative.^[3]

Matrix group

If (1) denotes the identity permutation, then $P_{(1)}$ is the identity matrix.

Let $S_{n}$ denote the symmetric group, or group of permutations, on $\{1,2,\ldots ,n\}$. Since there are $n!$ permutations, there are $n!$ permutation matrices. By the formulas above, the $n\times n$ permutation matrices form a group under matrix multiplication with the identity matrix as the identity element.
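For small $n$ the group axioms can be confirmed by brute force. The following sketch enumerates all $3\times 3$ permutation matrices; the choice $n=3$, the tuple-of-tuples encoding (used so that matrices can be stored in a set), and the helper names are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def perm_matrix(perm):
    """Column representation as a hashable tuple of tuples."""
    m = len(perm)
    return tuple(tuple(1 if j == perm[i] else 0 for j in range(m)) for i in range(m))


def matmul(a, b):
    """Matrix product returning a tuple of tuples."""
    return tuple(
        tuple(sum(x * y for x, y in zip(row, col)) for col in zip(*b)) for row in a
    )


n = 3
group = {perm_matrix(p) for p in permutations(range(n))}
identity = perm_matrix(tuple(range(n)))

# There are n! distinct matrices.
count_ok = len(group) == factorial(n)
# The set is closed under multiplication.
closed = all(matmul(a, b) in group for a in group for b in group)
# Every element has an inverse inside the set.
has_inverses = all(any(matmul(a, b) == identity for b in group) for a in group)

print(count_ok, closed, has_inverses)
```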

The map $S_{n}\to \operatorname {GL} (n,\mathbb {Z} _{2})$ that sends a permutation to its column representation is a faithful representation.

Doubly stochastic matrices

A permutation matrix is itself a doubly stochastic matrix, but it also plays a special role in the theory of these matrices. The Birkhoff–von Neumann theorem says that every doubly stochastic real matrix is a convex combination of permutation matrices of the same order and the permutation matrices are precisely the extreme points of the set of doubly stochastic matrices. That is, the Birkhoff polytope, the set of doubly stochastic matrices, is the convex hull of the set of permutation matrices.^[4]
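The easy direction of this statement, that any convex combination of permutation matrices is doubly stochastic, can be illustrated with exact rational arithmetic. The weights and permutations below are arbitrary choices made for the example, and the helper names are hypothetical; the sketch does not implement the decomposition asserted by the theorem.

```python
from fractions import Fraction


def convex_combination(weights, perms):
    """Sum of weights[k] * P_{perms[k]} in the column representation."""
    m = len(perms[0])
    result = [[Fraction(0)] * m for _ in range(m)]
    for w, perm in zip(weights, perms):
        for i in range(m):
            result[i][perm[i]] += w
    return result


def is_doubly_stochastic(a):
    """Non-negative entries with every row and column summing to 1."""
    nonneg = all(x >= 0 for row in a for x in row)
    rows_ok = all(sum(row) == 1 for row in a)
    cols_ok = all(sum(col) == 1 for col in zip(*a))
    return nonneg and rows_ok and cols_ok


# Non-negative weights summing to 1 (illustrative values).
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
perms = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]

D = convex_combination(weights, perms)
assert sum(weights) == 1
assert is_doubly_stochastic(D)
```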

Linear algebraic properties

The trace of a permutation matrix is the number of fixed points of the permutation. If the permutation has fixed points, so that it can be written in cycle form as $\pi =(a_{1})(a_{2})\cdots (a_{k})\sigma$ where $\sigma$ has no fixed points, then $\mathbf {e} _{a_{1}},\mathbf {e} _{a_{2}},\ldots ,\mathbf {e} _{a_{k}}$ are eigenvectors of the permutation matrix.
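On the running example, which has a single fixed point, both claims can be checked directly. The sketch assumes 0-based indices and uses hypothetical helper names.

```python
def perm_matrix(perm):
    """Column representation: row i has a 1 in column perm[i]."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def matvec(a, v):
    """Matrix times column vector, with the vector as a flat list."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


pi = [0, 3, 1, 4, 2]
P = perm_matrix(pi)

trace = sum(P[i][i] for i in range(len(pi)))
fixed_points = [k for k, image in enumerate(pi) if image == k]

# The trace counts the fixed points.
assert trace == len(fixed_points)

# Each e_k with k a fixed point is an eigenvector for the eigenvalue 1.
for k in fixed_points:
    e_k = [1 if i == k else 0 for i in range(len(pi))]
    assert matvec(P, e_k) == e_k
```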

To calculate the eigenvalues of a permutation matrix $P_{\sigma }$, write $\sigma$ as a product of cycles, say, $\sigma =C_{1}C_{2}\cdots C_{t}$. Let the corresponding lengths of these cycles be $l_{1},l_{2},\ldots ,l_{t}$, and let $R_{i}$ $(1\leq i\leq t)$ be the set of complex solutions of $x^{l_{i}}=1$. The union of all the sets $R_{i}$ is the set of eigenvalues of the corresponding permutation matrix. The geometric multiplicity of each eigenvalue equals the number of sets $R_{i}$ that contain it.^[5]
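The procedure can be sketched in Python without a linear-algebra library by building an explicit eigenvector for every root of unity attached to every cycle. The construction used here (placing successive powers of the eigenvalue along a cycle) is a standard one chosen for illustration, not something stated in the cited source; indices are 0-based and the helper names are hypothetical.

```python
import cmath


def cycles(perm):
    """Decompose a permutation (0-based list) into disjoint cycles."""
    seen = [False] * len(perm)
    result = []
    for start in range(len(perm)):
        if not seen[start]:
            cycle = []
            k = start
            while not seen[k]:
                seen[k] = True
                cycle.append(k)
                k = perm[k]
            result.append(cycle)
    return result


def eigenpairs(perm):
    """One (eigenvalue, eigenvector) pair per root of unity per cycle."""
    n = len(perm)
    pairs = []
    for cycle in cycles(perm):
        length = len(cycle)
        for r in range(length):
            lam = cmath.exp(2j * cmath.pi * r / length)
            v = [0j] * n
            for t, k in enumerate(cycle):
                v[k] = lam ** t  # powers of lam along the cycle
            pairs.append((lam, v))
    return pairs


def apply(perm, v):
    """Column representation acting on a column vector: (Pv)_i = v_{perm[i]}."""
    return [v[perm[i]] for i in range(len(perm))]


sigma = [0, 3, 1, 4, 2]  # cycles (0) and (1 3 4 2) in 0-based labels
pairs = eigenpairs(sigma)

# Every pair satisfies P v = lam v up to rounding error.
for lam, v in pairs:
    pv = apply(sigma, v)
    assert all(abs(a - lam * b) < 1e-9 for a, b in zip(pv, v))
```

Here the eigenvalue 1 arises once from each of the two cycles, in line with the multiplicity rule above.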

From group theory we know that any permutation may be written as a product of transpositions. Therefore, any permutation matrix $P$ factors as a product of row-interchanging elementary matrices, each having determinant −1. Thus, the determinant of a permutation matrix $P$ is the signature of the corresponding permutation.
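This can be confirmed exhaustively for small sizes. The sketch below computes the determinant by cofactor expansion and the signature by counting the transpositions needed to sort the permutation; both routines, the size $n=4$, and the helper names are illustrative assumptions.

```python
from itertools import permutations


def perm_matrix(perm):
    """Column representation: row i has a 1 in column perm[i]."""
    m = len(perm)
    return [[1 if j == perm[i] else 0 for j in range(m)] for i in range(m)]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if not a:
        return 1
    total = 0
    for j, x in enumerate(a[0]):
        if x:
            minor = [row[:j] + row[j + 1:] for row in a[1:]]
            total += (-1) ** j * x * det(minor)
    return total


def sign(perm):
    """Signature: +1 or -1 by parity of the swaps that sort perm."""
    perm = list(perm)
    swaps = 0
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # puts the value j in place j
            swaps += 1
    return -1 if swaps % 2 else 1


# Check det(P_pi) == sgn(pi) for every permutation of 4 elements.
all_match = all(det(perm_matrix(p)) == sign(p) for p in permutations(range(4)))
print(all_match)
```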

Examples

Permutation of rows and columns

When a matrix $M$ is multiplied by a permutation matrix $P$ on the left to make $PM$, the product is the result of permuting the rows of $M$. As a special case, if $M$ is a column vector, then $PM$ is the result of permuting the entries of $M$.

When instead $M$ is multiplied by a permutation matrix on the right to make $MP$, the product is the result of permuting the columns of $M$. As a special case, if $M$ is a row vector, then $MP$ is the result of permuting the entries of $M$.

Permutation of rows

The permutation matrix $P_{\pi }$ corresponding to the permutation $\pi ={\begin{pmatrix}1&2&3&4&5\\1&4&2&5&3\end{pmatrix}}$ is

P_{\pi }={\begin{bmatrix}\mathbf {e} _{\pi (1)}\\\mathbf {e} _{\pi (2)}\\\mathbf {e} _{\pi (3)}\\\mathbf {e} _{\pi (4)}\\\mathbf {e} _{\pi (5)}\end{bmatrix}}={\begin{bmatrix}\mathbf {e} _{1}\\\mathbf {e} _{4}\\\mathbf {e} _{2}\\\mathbf {e} _{5}\\\mathbf {e} _{3}\end{bmatrix}}={\begin{bmatrix}1&0&0&0&0\\0&0&0&1&0\\0&1&0&0&0\\0&0&0&0&1\\0&0&1&0&0\end{bmatrix}}.

Given a vector $\mathbf {g}$,

P_{\pi }\mathbf {g} ={\begin{bmatrix}\mathbf {e} _{\pi (1)}\\\mathbf {e} _{\pi (2)}\\\mathbf {e} _{\pi (3)}\\\mathbf {e} _{\pi (4)}\\\mathbf {e} _{\pi (5)}\end{bmatrix}}{\begin{bmatrix}g_{1}\\g_{2}\\g_{3}\\g_{4}\\g_{5}\end{bmatrix}}={\begin{bmatrix}g_{1}\\g_{4}\\g_{2}\\g_{5}\\g_{3}\end{bmatrix}}.

Explanation

A permutation matrix will always be in the form

{\begin{bmatrix}\mathbf {e} _{a_{1}}\\\mathbf {e} _{a_{2}}\\\vdots \\\mathbf {e} _{a_{j}}\\\end{bmatrix}}

where $\mathbf {e} _{a_{i}}$ represents the $a_{i}$th standard basis vector (as a row) of $\mathbb {R} ^{j}$, and where

{\begin{pmatrix}1&2&\ldots &j\\a_{1}&a_{2}&\ldots &a_{j}\end{pmatrix}}

is the permutation form of the permutation matrix.

Now, in performing matrix multiplication, one essentially forms the dot product of each row of the first matrix with each column of the second. In this instance, we form the dot product of each row of this matrix with the vector of elements we want to permute. That is, for example, with $v=(g_{1},\ldots ,g_{j})^{\mathsf {T}}$,

\mathbf {e} _{a_{i}}\cdot v=g_{a_{i}}.

So the product of the permutation matrix with the vector $v$ above is a vector of the form $(g_{a_{1}},g_{a_{2}},\ldots ,g_{a_{j}})$, and this is a permutation of $v$, since we have said that the permutation form is

{\begin{pmatrix}1&2&\ldots &j\\a_{1}&a_{2}&\ldots &a_{j}\end{pmatrix}}.

So, permutation matrices do indeed permute the order of elements in vectors multiplied with them.

Restricted forms

Costas array, a permutation matrix in which the displacement vectors between the entries are all distinct
n-queens puzzle, a permutation matrix in which there is at most one entry in each diagonal and antidiagonal
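Both restrictions are straightforward to test for a permutation given as a list. The sketch below is an illustration with 0-based indices; the function names and the sample permutations are choices made here, and each permutation is read as the set of positions (i, perm[i]) of the 1s in the matrix.

```python
from itertools import combinations


def is_costas(perm):
    """True if all displacement vectors between pairs of 1s are distinct."""
    vectors = [
        (j - i, perm[j] - perm[i]) for i, j in combinations(range(len(perm)), 2)
    ]
    return len(vectors) == len(set(vectors))


def is_queens_solution(perm):
    """True if no two 1s share a diagonal or an antidiagonal."""
    n = len(perm)
    diagonals = {i - perm[i] for i in range(n)}
    antidiagonals = {i + perm[i] for i in range(n)}
    return len(diagonals) == n and len(antidiagonals) == n


print(is_costas([0, 1, 3, 2]))           # a Costas array of order 4
print(is_queens_solution([1, 3, 0, 2]))  # a 4-queens placement
print(is_costas([0, 1, 2, 3]))           # identity repeats the vector (1, 1)
```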

References

[1] Terminology is not standard. Most authors choose one representation to be consistent with other notation they have introduced, so there is generally no need to supply a name.
[2] Brualdi (2006) p.2
[3] Zavlanos, Michael M.; Pappas, George J. (November 2008). "A dynamical systems approach to weighted graph matching". Automatica. 44 (11): 2817–2824. doi:10.1016/j.automatica.2008.04.009. S2CID 834305. Retrieved 21 August 2022. In particular, since permutation matrices are orthogonal matrices with nonnegative elements, we define two gradient flows in the space of orthogonal matrices... Lemma 5: Let $O_{n}$ denote the set of $n\times n$ orthogonal matrices and $N_{n}$ denote the set of $n\times n$ element-wise non-negative matrices. Then, $P_{n}=O_{n}\cap N_{n}$, where $P_{n}$ is the set of $n\times n$ permutation matrices.
[4] Brualdi (2006) p.19
[5] Najnudel & Nikeghbali (2010) p.4

Brualdi, Richard A. (2006). Combinatorial matrix classes. Encyclopedia of Mathematics and Its Applications. Vol. 108. Cambridge: Cambridge University Press. ISBN 0-521-86565-4. Zbl 1106.05001.
Najnudel, Joseph; Nikeghbali, Ashkan (2010), The Distribution of Eigenvalues of Randomized Permutation Matrices, arXiv:1005.0402, Bibcode:2010arXiv1005.0402N

