Rank (linear algebra)

From Wikipedia, the free encyclopedia

In linear algebra, the rank of a matrix A is a measure of the "nondegenerateness" of the system of linear equations and linear transformation encoded by A. There are many possible definitions of rank, including the size of the largest collection of linearly independent columns of A. Others are listed in the following section. The rank is one of the fundamental pieces of data associated to a matrix.

The rank is commonly denoted by either rk(A) or rank(A); sometimes the parentheses are unwritten, as in rk A.

Main definitions

In this section we give three definitions of the rank of a matrix. Many other definitions are possible; see below for a list of several of these.

The column rank of a matrix A is the maximum number of linearly independent column vectors of A. The row rank of A is the maximum number of linearly independent row vectors of A. Equivalently, the column rank of A is the dimension of the column space of A, while the row rank of A is the dimension of the row space of A.

A result of fundamental importance in linear algebra is that the column rank and the row rank are always equal. (Two proofs of this result are given below.) This number (i.e., the number of linearly independent rows or columns) is simply called the rank of A.

The rank is also the dimension of the image of the linear transformation that is given by multiplication by A. More generally, if a linear operator on a vector space (possibly infinite-dimensional) has finite-dimensional image (e.g., a finite-rank operator), then the rank of the operator is defined as the dimension of the image.

Examples

The matrix

\begin{bmatrix}1&2&1\\-2&-3&1\\3&5&0\end{bmatrix}

has rank 2: the first two rows are linearly independent, so the rank is at least 2, but all three rows are linearly dependent (the first is equal to the sum of the second and third) so the rank must be less than 3.

The matrix

A=\begin{bmatrix}1&1&0&2\\-1&-1&0&-2\end{bmatrix}

has rank 1: there are nonzero columns, so the rank is positive, but any two columns are linearly dependent. Similarly, the transpose

A^T = \begin{bmatrix}1&-1\\1&-1\\0&0\\2&-2\end{bmatrix}

of A has rank 1. Indeed, since the column vectors of A are the row vectors of the transpose of A, the statement that the column rank of a matrix equals its row rank is equivalent to the statement that the rank of a matrix is equal to the rank of its transpose, i.e., rk(A) = rk(A^T).

Computing the rank of a matrix

Rank from row echelon forms

A common approach to finding the rank of a matrix is to reduce it to a simpler form, generally row echelon form, by elementary row operations. Row operations do not change the row space (hence do not change the row rank), and, being invertible, map the column space to an isomorphic space (hence do not change the column rank). Once the matrix is in row echelon form, the row rank and the column rank are clearly equal: both equal the number of pivots (or basic columns), which is also the number of non-zero rows.

For example, the matrix A given by

A=\begin{bmatrix}1&2&1\\-2&-3&1\\3&5&0\end{bmatrix}

can be put in reduced row-echelon form by using the following elementary row operations:

\begin{aligned}
\begin{bmatrix}1&2&1\\-2&-3&1\\3&5&0\end{bmatrix}
&\xrightarrow{R_2 \rightarrow 2r_1 + r_2} \begin{bmatrix}1&2&1\\0&1&3\\3&5&0\end{bmatrix}
\xrightarrow{R_3 \rightarrow -3r_1 + r_3} \begin{bmatrix}1&2&1\\0&1&3\\0&-1&-3\end{bmatrix} \\
&\xrightarrow{R_3 \rightarrow r_2 + r_3} \begin{bmatrix}1&2&1\\0&1&3\\0&0&0\end{bmatrix}
\xrightarrow{R_1 \rightarrow -2r_2 + r_1} \begin{bmatrix}1&0&-5\\0&1&3\\0&0&0\end{bmatrix}.
\end{aligned}

The final matrix (in reduced row echelon form) has two non-zero rows and thus the rank of matrix A is 2.
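The row-reduction procedure can be sketched in code. The following is a minimal illustration, not a reference implementation: the function name rank is chosen here for the example, and exact rational arithmetic (Python's fractions module) is assumed so that the test for a zero entry is reliable.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination.

    Hypothetical helper written for this illustration.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank_so_far = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next(
            (r for r in range(rank_so_far, len(rows)) if rows[r][col] != 0),
            None,
        )
        if pivot is None:
            continue  # no pivot in this column
        rows[rank_so_far], rows[pivot] = rows[pivot], rows[rank_so_far]
        # Eliminate the entries below the pivot.
        for r in range(rank_so_far + 1, len(rows)):
            factor = rows[r][col] / rows[rank_so_far][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank_so_far])]
        rank_so_far += 1
    return rank_so_far  # number of pivots = number of non-zero rows

A = [[1, 2, 1], [-2, -3, 1], [3, 5, 0]]
print(rank(A))  # 2
```

Only a row echelon form is computed here; the further reduction to reduced row echelon form is not needed to count the pivots.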

Computation

When applied to floating point computations on computers, basic Gaussian elimination (LU decomposition) can be unreliable, and a rank-revealing decomposition should be used instead. An effective alternative is the singular value decomposition (SVD), but there are other less expensive choices, such as QR decomposition with pivoting (so-called rank-revealing QR factorization), which are still more numerically robust than Gaussian elimination. Numerical determination of rank requires a criterion for deciding when a value, such as a singular value from the SVD, should be treated as zero, a practical choice which depends on both the matrix and the application.
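As a rough sketch of how such a criterion works in practice, the following example counts the singular values above a tolerance. It assumes NumPy is available; the function name numerical_rank and the default tolerance (largest singular value times the larger matrix dimension times machine epsilon) are choices made for this illustration, and other thresholds may suit other applications better.

```python
import numpy as np

def numerical_rank(a, tol=None):
    """Count the singular values of `a` that exceed `tol`.

    Hypothetical helper for illustration; the default tolerance is one
    common convention, not the only reasonable one.
    """
    a = np.asarray(a, dtype=float)
    s = np.linalg.svd(a, compute_uv=False)  # singular values only
    if s.size == 0:
        return 0
    if tol is None:
        tol = s.max() * max(a.shape) * np.finfo(a.dtype).eps
    return int(np.sum(s > tol))

# A rank-1 matrix perturbed by a tiny amount in one entry.
nearly_singular = [[1.0, 2.0], [2.0, 4.0 + 1e-10]]
print(numerical_rank(nearly_singular))            # 2: the perturbation counts
print(numerical_rank(nearly_singular, tol=1e-8))  # 1: the perturbation is treated as noise
```

The two calls give different answers for the same matrix, which illustrates why the threshold has to be chosen with the application in mind.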

Proofs that column rank = row rank

The fact that the column and row ranks of any matrix are equal forms an important part of the fundamental theorem of linear algebra. We present two proofs of this result. The first is short, uses only basic properties of linear combinations of vectors, and is valid over any field. The second is an elegant argument using orthogonality and is valid for matrices over the real numbers; it is based upon Mackiw (1995).

First proof

Let A be an m × n matrix whose column rank is r. Therefore, the dimension of the column space of A is r. Let c_1,c_2,\ldots,c_r be any basis for the column space of A and place these vectors as column vectors to form the m × r matrix C = [c_1 \, c_2 \, \dots \, c_r]. By the definition of basis, each column vector of A is a linear combination of the r columns of C. Let R be the matrix whose (i, j)-th element is the coefficient of c_i when the j-th column of A is expressed as a linear combination of the r columns of C. From the definition of matrix multiplication, this r × n matrix R satisfies A = C R. (This is known as a rank factorization of A.)

Now, since A = C R, every row vector of A is a linear combination of the row vectors of R. (In particular, the (i, j)-th entry of C is the coefficient of the j-th row vector of R when the i-th row of A is expressed as a linear combination of the r rows of R.) This means that the row space of A is contained within the row space of R. Therefore, the row rank of A must be no larger than the row rank of R. But since R has r rows, the row rank of R is no larger than r, which is the column rank of A. This proves that the row rank of A is less than or equal to the column rank of A. Now apply the result to the transpose of A to get the reverse inequality: the column rank of A, which is equal to the row rank of A^T, is less than or equal to the column rank of A^T, which is equal to the row rank of A. The two inequalities together show that the row rank and the column rank of A are equal, as claimed.
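As a concrete instance of the factorization A = C R used in this proof, take the 3 × 3 matrix of rank 2 from the examples. One possible choice (an illustration, not the only one) is to let C consist of the first two columns of A; the third column of A equals -5 c_1 + 3 c_2, which gives the third column of R:

```latex
\begin{bmatrix}1&2&1\\-2&-3&1\\3&5&0\end{bmatrix}
=
\underbrace{\begin{bmatrix}1&2\\-2&-3\\3&5\end{bmatrix}}_{C}
\underbrace{\begin{bmatrix}1&0&-5\\0&1&3\end{bmatrix}}_{R}
```

Each row of A is then visibly a combination of the two rows of R, with coefficients read off from the corresponding row of C.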

Second proof

Let A be an m × n matrix with entries in the real numbers whose row rank is r. Therefore, the dimension of the row space of A is r. Let x_1, x_2,\ldots, x_r be a basis of the row space of A. We claim that the vectors Ax_1, Ax_2,\ldots, Ax_r are linearly independent. To see why, consider a linear homogeneous relation involving these vectors with scalar coefficients c_1,c_2,\ldots,c_r:

0 = c_1 Ax_1 + c_2 Ax_2 + \cdots + c_r Ax_r = A(c_1x_1 + c_2x_2 + \cdots + c_rx_r) = Av,

where v = c_1x_1 + c_2x_2 + \cdots + c_r x_r. We make two observations: (a) v is a linear combination of vectors in the row space of A, which implies that v belongs to the row space of A, and (b) since A v = 0, the vector v is orthogonal to every row vector of A and, hence, is orthogonal to every vector in the row space of A. The facts (a) and (b) together imply that v is orthogonal to itself, which proves that v = 0 or, by the definition of v,

c_1x_1 + c_2x_2 + \cdots + c_r x_r = 0.

But recall that the x_i were chosen as a basis of the row space of A and so are linearly independent. This implies that c_1 = c_2 = \cdots = c_r = 0. It follows that Ax_1, Ax_2,\ldots, Ax_r are linearly independent.

Now, each Ax_i is obviously a vector in the column space of A. So, Ax_1, Ax_2,\ldots, Ax_r is a set of r linearly independent vectors in the column space of A and, hence, the dimension of the column space of A (i.e., the column rank of A) must be at least as big as r. This proves that row rank of A is no larger than the column rank of A. Now apply this result to the transpose of A to get the reverse inequality and conclude as in the previous proof.

Alternative definitions

In all the definitions in this section, the matrix A is taken to be an m × n matrix over an arbitrary field F.

dimension of image

Given the matrix A, there is an associated linear mapping

f : F^n \to F^m

defined by

f(x) = Ax.

The rank of A is the dimension of the image of f. This definition has the advantage that it can be applied to any linear map without need for a specific matrix.

rank in terms of nullity

Given the same linear mapping f as above, the rank is n minus the dimension of the kernel of f. The rank–nullity theorem states that this definition is equivalent to the preceding one.
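For instance, for the 3 × 3 matrix A with rows (1, 2, 1), (-2, -3, 1), (3, 5, 0) from the examples, solving Ax = 0 gives x_1 = 5x_3 and x_2 = -3x_3, so the kernel is one-dimensional. A short worked computation (using the same f and n as above, with n = 3):

```latex
\ker f = \operatorname{span}\left\{ \begin{bmatrix}5\\-3\\1\end{bmatrix} \right\},
\qquad
\operatorname{rank}(A) = n - \dim \ker f = 3 - 1 = 2.
```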

column rank – dimension of column space

The rank of A is the maximal number of linearly independent columns c_1,c_2,\dots,c_k of A; this is the dimension of the column space of A (the column space being the subspace of F^m generated by the columns of A, which is in fact just the image of the linear map f associated to A).

row rank – dimension of row space

The rank of A is the maximal number of linearly independent rows of A; this is the dimension of the row space of A.

decomposition rank

The rank of A is the smallest integer k such that A can be factored as A=CR, where C is an m × k matrix and R is a k × n matrix.

As in the case of the "dimension of image" characterization, this can be generalized to a definition of the rank of any linear map: the rank of a linear map f : V \to W is the minimal dimension k of an intermediate space X such that f can be written as the composition of a map V \to X and a map X \to W. Unfortunately, this definition does not suggest an efficient manner to compute the rank (for which it is better to use one of the alternative definitions). See rank factorization for details.

determinantal rank – size of largest non-vanishing minor

The rank of A is the largest order of any non-zero minor in A. (The order of a minor is the side-length of the square sub-matrix of which it is the determinant.) Like the decomposition rank characterization, this does not give an efficient way of computing the rank, but it is useful theoretically: a single non-zero minor witnesses a lower bound (namely its order) for the rank of the matrix, which can be useful (for example) to prove that certain operations do not lower the rank of a matrix.

A non-vanishing p-minor (p × p submatrix with non-zero determinant) shows that the rows and columns of that submatrix are linearly independent, and thus those rows and columns of the full matrix are linearly independent (in the full matrix), so the row and column rank are at least as large as the determinantal rank; however, the converse is less straightforward. The equivalence of determinantal rank and column rank is a strengthening of the statement that if the span of n vectors has dimension p, then p of those vectors span the space (equivalently, that one can choose a spanning set that is a subset of the vectors): the equivalence implies that a subset of the rows and a subset of the columns simultaneously define an invertible submatrix (equivalently, if the span of n vectors has dimension p, then p of these vectors span the space and there is a set of p coordinates on which they are linearly independent).
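As a small worked check of this characterization, consider again the 3 × 3 matrix A with rows (1, 2, 1), (-2, -3, 1), (3, 5, 0) from the examples. Its only minor of order 3 is its determinant, which vanishes, while the minor of order 2 in the top-left corner does not, so the determinantal rank is 2:

```latex
\det A = 1\,(-3\cdot 0 - 1\cdot 5) - 2\,(-2\cdot 0 - 1\cdot 3) + 1\,(-2\cdot 5 - (-3)\cdot 3) = -5 + 6 - 1 = 0,
\qquad
\det\begin{bmatrix}1&2\\-2&-3\end{bmatrix} = 1\cdot(-3) - 2\cdot(-2) = 1 \neq 0.
```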

tensor rank – minimum number of simple tensors

The rank of A is the smallest number k such that A can be written as a sum of k rank-1 matrices. Note that this definition appears circular but actually is not, since rank-1 matrices can be defined without reference to the general definition of rank; in particular, a matrix M has rank 1 if and only if it can be written as a product c \cdot r of a column vector c and a row vector r. This notion of rank is called tensor rank; it can be generalized in the separable models interpretation of the singular value decomposition.
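The decomposition into rank-1 matrices can be checked on a small case. The sketch below, with helper names (outer, add) invented for this illustration, writes the rank-2 matrix from the examples as a sum of two products of a column vector and a row vector; the particular vectors are one possible choice among many.

```python
def outer(col, row):
    """Product of a column vector and a row vector: a matrix of rank at most 1."""
    return [[c * r for r in row] for c in col]

def add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

A = [[1, 2, 1], [-2, -3, 1], [3, 5, 0]]

# Columns c1, c2 are the first two columns of A; rows r1, r2 hold the
# coefficients that express every column of A in terms of c1 and c2.
c1, r1 = [1, -2, 3], [1, 0, -5]
c2, r2 = [2, -3, 5], [0, 1, 3]

total = add(outer(c1, r1), outer(c2, r2))
print(total == A)  # True
```

Since A has rank 2, no single product of a column vector and a row vector can equal A, so two terms is the minimum here.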

Properties

We assume that A is an m × n matrix, and we define the linear map f by f(x) = Ax as above.

  • The rank of an m × n matrix is a nonnegative integer and cannot be greater than either m or n. That is, rk(A) ≤ min(m, n). A matrix that has a rank as large as possible is said to have full rank; otherwise, the matrix is rank deficient.
  • Only a zero matrix has rank zero.
  • f is injective if and only if A has rank n (in this case, we say that A has full column rank).
  • f is surjective if and only if A has rank m (in this case, we say that A has full row rank).
  • If A is a square matrix (i.e., m = n), then A is invertible if and only if A has rank n (that is, A has full rank).
  • If B is any n × k matrix, then
\operatorname{rank}(AB) \leq \min(\operatorname{rank}\ A, \operatorname{rank}\ B).
  • If B is an n × k matrix of rank n, then
\operatorname{rank}(AB) = \operatorname{rank}(A).
  • If C is an l × m matrix of rank m, then
\operatorname{rank}(CA) = \operatorname{rank}(A).
  • The rank of A is equal to r if and only if there exists an invertible m × m matrix X and an invertible n × n matrix Y such that

  XAY =
  \begin{bmatrix}
    I_r & 0 \\
    0 & 0 \\
  \end{bmatrix},
where I_r denotes the r × r identity matrix.
  • Sylvester’s rank inequality: if A is an m × n matrix and B is n × k, then
\operatorname{rank}(A) + \operatorname{rank}(B) - n \leq \operatorname{rank}(A B).[1]
This is a special case of the next inequality.
  • The inequality due to Frobenius: if AB, ABC and BC are defined, then
\operatorname{rank}(AB) + \operatorname{rank}(BC) \le \operatorname{rank}(B) + \operatorname{rank}(ABC).[2]
  • Subadditivity: rank(A + B) ≤ rank(A) + rank(B) when A and B are of the same dimension. As a consequence, a rank-k matrix can be written as the sum of k rank-1 matrices, but not fewer.
  • The rank of a matrix plus the nullity of the matrix equals the number of columns of the matrix. (This is the rank–nullity theorem.)
  • If A is a matrix over the real numbers then the rank of A and the rank of its corresponding Gram matrix are equal. Thus, for real matrices
\operatorname{rank}(A^T A) = \operatorname{rank}(A A^T) = \operatorname{rank}(A) = \operatorname{rank}(A^T).
This can be shown by proving that their null spaces are equal. The null space of the Gram matrix consists of the vectors x for which A^T A x = 0. If this condition holds, then 0 = x^T A^T A x = |A x|^2 also holds, so A x = 0. [3]
  • If A is a matrix over the complex numbers and A* denotes the conjugate transpose of A (i.e., the adjoint of A), then
\operatorname{rank}(A) = \operatorname{rank}(\overline{A}) = \operatorname{rank}(A^T) = \operatorname{rank}(A^*) = \operatorname{rank}(A^*A).
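A few of the properties listed above can be spot-checked on a small example. This is only a sanity check on one pair of matrices, not a proof; the helpers rank, matmul and transpose are written for this illustration and use exact rational arithmetic.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by exact Gaussian elimination (hypothetical helper for this check)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

def matmul(x, y):
    """Matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

A = [[1, 2, 1], [-2, -3, 1], [3, 5, 0]]  # 3 x 3, rank 2
B = [[5], [-3], [1]]                      # 3 x 1, rank 1, lies in the kernel of A
n = len(B)                                # inner dimension of the product AB

AB = matmul(A, B)
# Sylvester's lower bound and the upper bound min(rank A, rank B).
assert rank(A) + rank(B) - n <= rank(AB) <= min(rank(A), rank(B))

# A real matrix, its transpose and both Gram matrices share the same rank.
At = transpose(A)
assert rank(matmul(At, A)) == rank(matmul(A, At)) == rank(A) == rank(At)

print(rank(A), rank(B), rank(AB))  # 2 1 0
```

The matrix B was chosen so that AB = 0, which makes Sylvester's inequality an equality in this case (2 + 1 - 3 = 0).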

Applications

One useful application of calculating the rank of a matrix is the computation of the number of solutions of a system of linear equations. According to the Rouché–Capelli theorem, the system is inconsistent if the rank of the augmented matrix is greater than the rank of the coefficient matrix. If, on the other hand, the ranks of these two matrices are equal, then the system must have at least one solution. The solution is unique if and only if the rank equals the number of variables. Otherwise the general solution has k free parameters, where k is the difference between the number of variables and the rank. In this case (and assuming the system of equations is over the real or complex numbers) the system of equations has infinitely many solutions.
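The rank comparison described by the Rouché–Capelli theorem can be sketched as follows. The function names rank and classify_system, and the returned labels, are assumptions made for this illustration; exact rational arithmetic is used so that the ranks are computed reliably.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by exact Gaussian elimination (hypothetical helper)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

def classify_system(A, b):
    """Classify the system A x = b by comparing two ranks.

    Returns a (label, free_parameters) pair. The label "infinitely many"
    assumes an infinite field such as the real or complex numbers.
    """
    n_vars = len(A[0])
    augmented = [list(row) + [rhs] for row, rhs in zip(A, b)]
    r = rank(A)
    if rank(augmented) > r:
        return ("inconsistent", None)
    free = n_vars - r  # number of free parameters in the general solution
    return ("unique", 0) if free == 0 else ("infinitely many", free)

A = [[1, 2, 1], [-2, -3, 1], [3, 5, 0]]
print(classify_system(A, [1, 1, 0]))            # ('infinitely many', 1)
print(classify_system(A, [1, 0, 0]))            # ('inconsistent', None)
print(classify_system([[1, 0], [0, 1]], [3, 4]))  # ('unique', 0)
```

In the first call the right-hand side is the third column of A, so it lies in the column space and the system is consistent; in the second it does not, and the augmented matrix has rank 3.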

In control theory, the rank of a matrix can be used to determine whether a linear system is controllable, or observable.

Generalization

There are different generalisations of the concept of rank to matrices over arbitrary rings. In those generalisations, the column rank, row rank, dimension of the column space and dimension of the row space of a matrix may differ from one another or may not exist.

Thinking of matrices as tensors, the tensor rank generalizes to arbitrary tensors; note that for tensors of order greater than 2 (matrices are order 2 tensors), rank is very hard to compute, unlike for matrices.

There is a notion of rank for smooth maps between smooth manifolds. It is equal to the linear rank of the derivative.

Matrices as tensors

Matrix rank should not be confused with tensor order, which is also sometimes called tensor rank. Tensor order is the number of indices required to write a tensor, and thus matrices all have tensor order 2. More precisely, matrices are tensors of type (1,1), having one row index and one column index, also called covariant order 1 and contravariant order 1; see Tensor (intrinsic definition) for details.

Note that the tensor rank of a matrix can also mean the minimum number of simple tensors necessary to express the matrix as a linear combination, and that this definition does agree with matrix rank as here discussed.

References

  • Mackiw, G. (1995), "A Note on the Equality of the Column and Row Rank of a Matrix", Mathematics Magazine 68 (4) 
  1. ^ Proof: Apply the rank–nullity theorem to the inequality
    \dim \operatorname{ker}(AB) \le \dim \operatorname{ker}(A) + \dim \operatorname{ker}(B).
  2. ^ Proof: The map
    C: \operatorname{ker}(ABC) / \operatorname{ker}(BC) \to \operatorname{ker}(AB) / \operatorname{ker}(B)
    is well-defined and injective. We thus obtain the inequality in terms of dimensions of kernels, which can then be converted to the inequality in terms of ranks by the rank–nullity theorem. Alternatively, if M is a linear subspace then dim(AM) ≤ dim(M); apply this inequality to the subspace defined by the (orthogonal) complement of the image of BC in the image of B, whose dimension is rk(B) – rk(BC); its image under A has dimension rk(AB) – rk(ABC).
  3. ^ Leon Mirsky: An Introduction to Linear Algebra, 1990, ISBN 0-486-66434-1

Further reading

  • Horn, Roger A. and Johnson, Charles R. Matrix Analysis. Cambridge University Press, 1985. ISBN 0-521-38632-2.
  • Kaw, Autar K. Two Chapters from the book Introduction to Matrix Algebra: 1. Vectors and System of Equations
  • Mike Brookes: Matrix Reference Manual.