Schur complement

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In linear algebra and the theory of matrices, the Schur complement of a matrix block (i.e., a submatrix within a larger matrix) is defined as follows. Suppose A, B, C, D are respectively p×p, p×q, q×p and q×q matrices, and D is invertible. Let

M=\left[\begin{matrix} A & B \\ C & D \end{matrix}\right]

so that M is a (p+q)×(p+q) matrix.

Then the Schur complement of the block D of the matrix M is the p×p matrix


It is named after Issai Schur who used it to prove Schur's lemma, although it had been used previously.[1] Emilie Haynsworth was the first to call it the Schur complement.[2] The Schur complement is a key tool in the fields of numerical analysis, statistics and matrix analysis.


The Schur complement arises as the result of performing a block Gaussian elimination by multiplying the matrix M from the right with the "block lower triangular" matrix

L=\left[\begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right].

Here Ip denotes a p×p identity matrix. After multiplication with the matrix L the Schur complement appears in the upper p×p block. The product matrix is

ML &= \left[\begin{matrix} A & B \\ C & D \end{matrix}\right]\left[\begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right] = \left[\begin{matrix} A-BD^{-1}C & B \\ 0 & D \end{matrix}\right] \\
&= \left[\begin{matrix} I_p & BD^{-1} \\ 0 & I_q \end{matrix}\right] \left[\begin{matrix} A-BD^{-1}C & 0 \\ 0 & D \end{matrix}\right].

This is analogous to an LDU decomposition. That is, we have shown that

\left[\begin{matrix} A & B \\ C & D \end{matrix}\right] &= \left[\begin{matrix} I_p & BD^{-1} \\ 0 & I_q \end{matrix}\right] \left[\begin{matrix} A-BD^{-1}C & 0 \\ 0 & D \end{matrix}\right]
\left[ \begin{matrix} I_p & 0 \\ D^{-1}C & I_q \end{matrix}\right],

and inverse of M thus may be expressed involving D−1 and the inverse of Schur's complement (if it exists) only as

& {} \quad \left[ \begin{matrix} A & B \\ C & D \end{matrix}\right]^{-1} =
\left[ \begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right]
\left[ \begin{matrix} (A-BD^{-1}C)^{-1} & 0 \\ 0 & D^{-1} \end{matrix}\right]
\left[ \begin{matrix} I_p & -BD^{-1} \\ 0 & I_q \end{matrix}\right] \\[12pt]
& = \left[ \begin{matrix} \left(A-B D^{-1} C \right)^{-1}  &   -\left(A-B D^{-1} C \right)^{-1} B D^{-1} \\ -D^{-1}C\left(A-B D^{-1} C \right)^{-1} & D^{-1}+ D^{-1} C \left(A-B D^{-1} C \right)^{-1} B D^{-1} \end{matrix} \right].

C.f. matrix inversion lemma which illustrates relationships between the above and the equivalent derivation with the roles of A and D interchanged.

If M is a positive-definite symmetric matrix, then so is the Schur complement of D in M.

If p and q are both 1 (i.e. A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:

 M^{-1} = \frac{1}{AD-BC} \left[ \begin{matrix} D & -B \\ -C & A \end{matrix}\right]

provided that AD − BC is non-zero.

Moreover, the determinant of M is also clearly seen to be given by

 \det(M) = \det(D) \det(A - BD^{-1} C)

which generalizes the determinant formula for 2x2 matrices.

Application to solving linear equations[edit]

The Schur complement arises naturally in solving a system of linear equations such as

Ax + By = a \,
Cx + Dy = b \,

where x, a are p-dimensional column vectors, y, b are q-dimensional column vectors, and A, B, C, D are as above. Multiplying the bottom equation by BD^{-1} and then subtracting from the top equation one obtains

(A - BD^{-1} C) x = a - BD^{-1} b.\,

Thus if one can invert D as well as the Schur complement of D, one can solve for x, and then by using the equation Cx + Dy = b one can solve for y. This reduces the problem of inverting a (p+q) \times (p+q) matrix to that of inverting a p×p matrix and a q×q matrix. In practice one needs D to be well-conditioned in order for this algorithm to be numerically accurate.

In electrical engineering this is often referred to as node elimination or Kron reduction.

Applications to probability theory and statistics[edit]

Suppose the random column vectors X, Y live in Rn and Rm respectively, and the vector (X, Y) in Rn+m has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

\Sigma=\left[\begin{matrix} A & B \\ B^T & C \end{matrix}\right],

where A \in \mathbb{R}^{n\times n} is the covariance matrix of X, C \in \mathbb{R}^{m\times m} is the covariance matrix of Y and B \in \mathbb{R}^{n\times m} is the covariance matrix between X and Y.

Then the conditional covariance of X given Y is the Schur complement of C in \Sigma:

\operatorname{Cov}(X\mid Y) = A-BC^{-1}B^T.
\operatorname{E}(X\mid Y) = \operatorname{E}(X) + BC^{-1}(Y-\operatorname{E}(Y)).

If we take the matrix \Sigma above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in \Sigma also has a Wishart distribution.[citation needed]

Schur complement condition for positive definiteness[edit]

Let X be a symmetric matrix given by

X=\left[\begin{matrix} A & B \\ B^T & C \end{matrix}\right].

Let S be the Schur complement of A in X, that is:

S= C - B^T A^{-1} B . \,


  • X is positive definite if and only if A and S are both positive definite:
X \succ  0 \Leftrightarrow  A \succ  0, S = C - B^T A^{-1} B \succ  0.
  • X is positive definite if and only if C and A - B C^{-1} B^T are both positive definite:
X \succ  0 \Leftrightarrow C \succ  0, A - B C^{-1} B^T \succ  0.
  • If A is positive definite, then X is positive semidefinite if and only if S is positive semidefinite:
\text{If} A \succ  0, \text{then} X \succeq  0 \Leftrightarrow S =  C - B^T A^{-1} B \succeq  0.
  • If C is positive definite, then X is positive semidefinite if and only if A - B C^{-1} B^T is positive semidefinite:
\text{If} C \succ  0, \text{then} X \succeq  0 \Leftrightarrow A - B C^{-1} B^T \succeq  0.

The first and third statements can be derived[3] by considering the minimizer of the quantity

 u^T A u + 2 v^T B^T u + v^T C v, \,

as a function of v (for fixed u).

Furthermore, since

 \left[\begin{matrix} A & B \\ B^T & C \end{matrix}\right] \succ 0 \Longleftrightarrow \left[\begin{matrix} C & B^T \\ B & A \end{matrix}\right] \succ 0

and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third).

See also[edit]


  1. ^ Zhang, Fuzhen (2005). The Schur Complement and Its Applications. Springer. doi:10.1007/b105056. ISBN 0-387-24271-6. 
  2. ^ Haynsworth, E. V., "On the Schur Complement", Basel Mathematical Notes, #BNB 20, 17 pages, June 1968.
  3. ^ Boyd, S. and Vandenberghe, L. (2004), "Convex Optimization", Cambridge University Press (Appendix A.5.5)