# Schur complement

Jump to navigation Jump to search

In linear algebra and the theory of matrices, the Schur complement of a block matrix is defined as follows.

Suppose p, q are nonnegative integers, and suppose A, B, C, D are respectively p × p, p × q, q × p, and q × q matrices of complex numbers. Let

$M=\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]$ so that M is a (p + q) × (p + q) matrix.

If D is invertible, then the Schur complement of the block D of the matrix M is the p × p matrix defined by

$M/D:=A-BD^{-1}C.$ If A is invertible, the Schur complement of the block A of the matrix M is the q × q matrix defined by

$M/A:=D-CA^{-1}B.$ In the case that A or D is singular, substituting a generalized inverse for the inverses on M/A and M/D yields the generalized Schur complement.

The Schur complement is named after Issai Schur who used it to prove Schur's lemma, although it had been used previously. Emilie Virginia Haynsworth was the first to call it the Schur complement. The Schur complement is a key tool in the fields of numerical analysis, statistics, and matrix analysis.

## Background

The Schur complement arises as the result of performing a block Gaussian elimination by multiplying the matrix M from the right with a block lower triangular matrix

$L={\begin{bmatrix}I_{p}&0\\-D^{-1}C&I_{q}\end{bmatrix}}.$ Here Ip denotes a p×p identity matrix. After multiplication with the matrix L the Schur complement appears in the upper p×p block. The product matrix is

{\begin{aligned}ML&={\begin{bmatrix}A&B\\C&D\end{bmatrix}}{\begin{bmatrix}I_{p}&0\\-D^{-1}C&I_{q}\end{bmatrix}}={\begin{bmatrix}A-BD^{-1}C&B\\0&D\end{bmatrix}}\\[4pt]&={\begin{bmatrix}I_{p}&BD^{-1}\\0&I_{q}\end{bmatrix}}{\begin{bmatrix}A-BD^{-1}C&0\\0&D\end{bmatrix}}\\[6pt]\Rightarrow M&={\begin{bmatrix}A&B\\C&D\end{bmatrix}}={\begin{bmatrix}I_{p}&BD^{-1}\\0&I_{q}\end{bmatrix}}{\begin{bmatrix}A-BD^{-1}C&0\\0&D\end{bmatrix}}{\begin{bmatrix}I_{p}&0\\D^{-1}C&I_{q}\end{bmatrix}},\end{aligned}} I.e., an LDU decomposition. Thus, the inverse of M may be expressed involving D−1 and the inverse of Schur's complement, if it exists, as

{\begin{aligned}M^{-1}={\begin{bmatrix}A&B\\C&D\end{bmatrix}}^{-1}={}&\left({\begin{bmatrix}I_{p}&BD^{-1}\\0&I_{q}\end{bmatrix}}{\begin{bmatrix}A-BD^{-1}C&0\\0&D\end{bmatrix}}{\begin{bmatrix}I_{p}&0\\D^{-1}C&I_{q}\end{bmatrix}}\right)^{-1}\\={}&{\begin{bmatrix}I_{p}&0\\-D^{-1}C&I_{q}\end{bmatrix}}{\begin{bmatrix}\left(A-BD^{-1}C\right)^{-1}&0\\0&D^{-1}\end{bmatrix}}{\begin{bmatrix}I_{p}&-BD^{-1}\\0&I_{q}\end{bmatrix}}\\[4pt]={}&{\begin{bmatrix}\left(A-BD^{-1}C\right)^{-1}&-\left(A-BD^{-1}C\right)^{-1}BD^{-1}\\-D^{-1}C\left(A-BD^{-1}C\right)^{-1}&D^{-1}+D^{-1}C\left(A-BD^{-1}C\right)^{-1}BD^{-1}\end{bmatrix}}\\[4pt]={}&{\begin{bmatrix}\left(M/D\right)^{-1}&-\left(M/D\right)^{-1}BD^{-1}\\-D^{-1}C\left(M/D\right)^{-1}&D^{-1}+D^{-1}C\left(M/D\right)^{-1}BD^{-1}\end{bmatrix}}.\end{aligned}} Cf. matrix inversion lemma which illustrates relationships between the above and the equivalent derivation with the roles of A and D interchanged.

In another interpretation, the Schur complement also comes up in solving linear equations, by eliminating one block of variables. We start with,

$M={\begin{bmatrix}A&B\\C&D\end{bmatrix}}{\begin{bmatrix}x\\y\end{bmatrix}}={\begin{bmatrix}u\\v\end{bmatrix}}$ .

Assuming that the submatrix $A$ is invertible, we can eliminate $x$ from the equations, as follows.

$x=A^{-1}(u-By)$ .

Substituting this expression into the second equation yields

$\left(D-CA^{-1}B\right)y=v-CA^{-1}u$ .

We refer to this as the reduced equation obtained by eliminating $x$ from the original equation. The matrix appearing in the reduced equation is called the Schur complement of the first block $A$ in $M$ :

$S\ {\overset {\underset {\mathrm {def} }{}}{=}}\ D-CA^{-1}B$ .

Solving the reduced equation, we obtain

$y=S^{-1}\left(v-CA^{-1}u\right)$ .

Substituting this into the first equation yields

$x=\left(A^{-1}+A^{-1}BS^{-1}CA^{-1}\right)u-A^{-1}BS^{-1}v$ .

We can express the above two equation as:

${\begin{bmatrix}x\\y\end{bmatrix}}={\begin{bmatrix}A^{-1}+A^{-1}BS^{-1}CA^{-1}&-A^{-1}BS^{-1}\\-S^{-1}CA^{-1}&S^{-1}\end{bmatrix}}{\begin{bmatrix}u\\v\end{bmatrix}}$ .

Therefore, a formulation for the inverse of a block matrix is:

${\begin{bmatrix}A&B\\C&D\end{bmatrix}}^{-1}={\begin{bmatrix}A^{-1}+A^{-1}BS^{-1}CA^{-1}&-A^{-1}BS^{-1}\\-S^{-1}CA^{-1}&S^{-1}\end{bmatrix}}$ .

In particular, we see that the Schur complement is the inverse of the $2,2$ block entry of the inverse of $M$ .

## Properties

• If p and q are both 1 (i.e., A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:
$M^{-1}={\frac {1}{AD-BC}}\left[{\begin{matrix}D&-B\\-C&A\end{matrix}}\right]$ provided that AD − BC is non-zero.
• In general, if A is invertible, then
{\begin{aligned}M&={\begin{bmatrix}I_{p}&0\\CA^{-1}&I_{q}\end{bmatrix}}{\begin{bmatrix}A&0\\0&D-CA^{-1}B\end{bmatrix}}{\begin{bmatrix}I_{p}&A^{-1}B\\0&I_{q}\end{bmatrix}},\\[4pt]M^{-1}&={\begin{bmatrix}A^{-1}+A^{-1}B(M/A)^{-1}CA^{-1}&-A^{-1}B(M/A)^{-1}\\-(M/A)^{-1}CA^{-1}&(M/A)^{-1}\end{bmatrix}}\end{aligned}} whenever this inverse exists.
• When A, respectively D, is invertible, the determinant of M is also clearly seen to be given by
$\det(M)=\det(A)\det \left(D-CA^{-1}B\right)$ , respectively
$\det(M)=\det(D)\det \left(A-BD^{-1}C\right)$ ,
which generalizes the determinant formula for 2 × 2 matrices.
• (Guttman rank additivity formula) If D is invertible, then the rank of M is given by
$\operatorname {rank} (M)=\operatorname {rank} (D)+\operatorname {rank} \left(A-BD^{-1}C\right)$ (Haynsworth inertia additivity formula) If A is invertible, then the inertia of the block matrix M is equal to the inertia of A plus the inertia of M/A.

## Application to solving linear equations

The Schur complement arises naturally in solving a system of linear equations such as

{\begin{aligned}Ax+By&=a\\Cx+Dy&=b\end{aligned}} where x, a are p-dimensional column vectors, y, b are q-dimensional column vectors, A, B, C, D are as above, and D is invertible. Multiplying the bottom equation by ${\textstyle BD^{-1}}$ and then subtracting from the top equation one obtains

$\left(A-BD^{-1}C\right)x=a-BD^{-1}b.$ Thus if one can invert D as well as the Schur complement of D, one can solve for x, and then by using the equation ${\textstyle Cx+Dy=b}$ one can solve for y. This reduces the problem of inverting a ${\textstyle (p+q)\times (p+q)}$ matrix to that of inverting a p × p matrix and a q × q matrix. In practice, one needs D to be well-conditioned in order for this algorithm to be numerically accurate.

In electrical engineering this is often referred to as node elimination or Kron reduction.

## Applications to probability theory and statistics

Suppose the random column vectors X, Y live in Rn and Rm respectively, and the vector (X, Y) in Rn + m has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

$\Sigma =\left[{\begin{matrix}A&B\\B^{\mathsf {T}}&C\end{matrix}}\right],$ where ${\textstyle A\in \mathbb {R} ^{n\times n}}$ is the covariance matrix of X, ${\textstyle C\in \mathbb {R} ^{m\times m}}$ is the covariance matrix of Y and ${\textstyle B\in \mathbb {R} ^{n\times m}}$ is the covariance matrix between X and Y.

Then the conditional covariance of X given Y is the Schur complement of C in ${\textstyle \Sigma }$ :

{\begin{aligned}\operatorname {Cov} (X\mid Y)&=A-BC^{-1}B^{\mathsf {T}}\\\operatorname {E} (X\mid Y)&=\operatorname {E} (X)+BC^{-1}(Y-\operatorname {E} (Y))\end{aligned}} If we take the matrix $\Sigma$ above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in $\Sigma$ also has a Wishart distribution.[citation needed]

## Conditions for positive definiteness and semi-definiteness

Let X be a symmetric matrix of real numbers given by

$X=\left[{\begin{matrix}A&B\\B^{\mathsf {T}}&C\end{matrix}}\right].$ Then

• If A is invertible, then X is positive definite if and only if A and its complement X/A are both positive definite:
$X\succ 0\Leftrightarrow A\succ 0,X/A=C-B^{\mathsf {T}}A^{-1}B\succ 0.$ • If C is invertible, then X is positive definite if and only if C and its complement X/C are both positive definite:
$X\succ 0\Leftrightarrow C\succ 0,X/C=A-BC^{-1}B^{\mathsf {T}}\succ 0.$ • If A is positive definite, then X is positive semi-definite if and only if the complement X/A is positive semi-definite:
${\text{If }}A\succ 0,{\text{ then }}X\succeq 0\Leftrightarrow X/A=C-B^{\mathsf {T}}A^{-1}B\succeq 0.$ • If C is positive definite, then X is positive semi-definite if and only if the complement X/C is positive semi-definite:
${\text{If }}C\succ 0,{\text{ then }}X\succeq 0\Leftrightarrow X/C=A-BC^{-1}B^{\mathsf {T}}\succeq 0.$ The first and third statements can be derived by considering the minimizer of the quantity

$u^{\mathsf {T}}Au+2v^{\mathsf {T}}B^{\mathsf {T}}u+v^{\mathsf {T}}Cv,\,$ as a function of v (for fixed u).

Furthermore, since

$\left[{\begin{matrix}A&B\\B^{\mathsf {T}}&C\end{matrix}}\right]\succ 0\Longleftrightarrow \left[{\begin{matrix}C&B^{\mathsf {T}}\\B&A\end{matrix}}\right]\succ 0$ and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third) statement.

There is also a sufficient and necessary condition for the positive semi-definiteness of X in terms of a generalized Schur complement. Precisely,

• $X\succeq 0\Leftrightarrow A\succeq 0,C-B^{\mathsf {T}}A^{g}B\succeq 0,\left(I-AA^{g}\right)B=0\,$ and
• $X\succeq 0\Leftrightarrow C\succeq 0,A-BC^{g}B^{\mathsf {T}}\succeq 0,\left(I-CC^{g}\right)B^{\mathsf {T}}=0,$ where $A^{g}$ denotes the generalized inverse of $A$ .