Sherman–Morrison formula

In mathematics, in particular linear algebra, the Sherman–Morrison formula,[1][2][3] named after Jack Sherman and Winifred J. Morrison, computes the inverse of the sum of an invertible matrix ${\displaystyle A}$ and the outer product, ${\displaystyle uv^{\textsf {T}}}$, of vectors ${\displaystyle u}$ and ${\displaystyle v}$. The Sherman–Morrison formula is a special case of the Woodbury formula. Though named after Sherman and Morrison, it appeared already in earlier publications.[4]

Statement

Suppose ${\displaystyle A\in \mathbb {R} ^{n\times n}}$ is an invertible square matrix and ${\displaystyle u,v\in \mathbb {R} ^{n}}$ are column vectors. Then ${\displaystyle A+uv^{\textsf {T}}}$is invertible iff ${\displaystyle 1+v^{\textsf {T}}A^{-1}u\neq 0}$. In this case,

${\displaystyle \left(A+uv^{\textsf {T}}\right)^{-1}=A^{-1}-{A^{-1}uv^{\textsf {T}}A^{-1} \over 1+v^{\textsf {T}}A^{-1}u}.}$

Here, ${\displaystyle uv^{\textsf {T}}}$ is the outer product of two vectors ${\displaystyle u}$ and ${\displaystyle v}$. The general form shown here is the one published by Bartlett.[5]

Proof

(${\displaystyle \Leftarrow }$) To prove that the backward direction (${\displaystyle 1+v^{\textsf {T}}A^{-1}u\neq 0\Rightarrow A+uv^{\textsf {T}}}$ is invertible with inverse given as above) is true, we verify the properties of the inverse. A matrix ${\displaystyle Y}$ (in this case the right-hand side of the Sherman–Morrison formula) is the inverse of a matrix ${\displaystyle X}$ (in this case ${\displaystyle A+uv^{\textsf {T}}}$) if and only if ${\displaystyle XY=YX=I}$.

We first verify that the right hand side (${\displaystyle Y}$) satisfies ${\displaystyle XY=I}$.

{\displaystyle {\begin{aligned}XY&=\left(A+uv^{\textsf {T}}\right)\left(A^{-1}-{A^{-1}uv^{\textsf {T}}A^{-1} \over 1+v^{\textsf {T}}A^{-1}u}\right)\\[6pt]&=AA^{-1}+uv^{\textsf {T}}A^{-1}-{AA^{-1}uv^{\textsf {T}}A^{-1}+uv^{\textsf {T}}A^{-1}uv^{\textsf {T}}A^{-1} \over 1+v^{\textsf {T}}A^{-1}u}\\[6pt]&=I+uv^{\textsf {T}}A^{-1}-{uv^{\textsf {T}}A^{-1}+uv^{\textsf {T}}A^{-1}uv^{\textsf {T}}A^{-1} \over 1+v^{\textsf {T}}A^{-1}u}\\[6pt]&=I+uv^{\textsf {T}}A^{-1}-{u\left(1+v^{\textsf {T}}A^{-1}u\right)v^{\textsf {T}}A^{-1} \over 1+v^{\textsf {T}}A^{-1}u}\\[6pt]&=I+uv^{\textsf {T}}A^{-1}-uv^{\textsf {T}}A^{-1}\\[6pt]\end{aligned}}}

To end the proof of this direction, we need to show that ${\displaystyle YX=I}$ in a similar way as above:

${\displaystyle YX=\left(A^{-1}-{A^{-1}uv^{\textsf {T}}A^{-1} \over 1+v^{\textsf {T}}A^{-1}u}\right)(A+uv^{\textsf {T}})=I.}$

(In fact, the last step can be avoided since for square matrices ${\displaystyle X}$ and ${\displaystyle Y}$, ${\displaystyle XY=I}$ is equivalent to ${\displaystyle YX=I}$.)

(${\displaystyle \Rightarrow }$) Reciprocally, if ${\displaystyle 1+v^{\textsf {T}}A^{-1}u=0}$, then via the matrix determinant lemma, ${\displaystyle \det \!\left(A+uv^{\textsf {T}}\right)=(1+v^{\textsf {T}}A^{-1}u)\det(A)=0}$, so ${\displaystyle \left(A+uv^{\textsf {T}}\right)}$ is not invertible.

Application

If the inverse of ${\displaystyle A}$ is already known, the formula provides a numerically cheap way to compute the inverse of ${\displaystyle A}$ corrected by the matrix ${\displaystyle uv^{\textsf {T}}}$ (depending on the point of view, the correction may be seen as a perturbation or as a rank-1 update). The computation is relatively cheap because the inverse of ${\displaystyle A+uv^{\textsf {T}}}$ does not have to be computed from scratch (which in general is expensive), but can be computed by correcting (or perturbing) ${\displaystyle A^{-1}}$.

Using unit columns (columns from the identity matrix) for ${\displaystyle u}$ or ${\displaystyle v}$, individual columns or rows of ${\displaystyle A}$ may be manipulated and a correspondingly updated inverse computed relatively cheaply in this way.[6] In the general case, where ${\displaystyle A^{-1}}$ is a ${\displaystyle n}$-by-${\displaystyle n}$ matrix and ${\displaystyle u}$ and ${\displaystyle v}$ are arbitrary vectors of dimension ${\displaystyle n}$, the whole matrix is updated[5] and the computation takes ${\displaystyle 3n^{2}}$ scalar multiplications.[7] If ${\displaystyle u}$ is a unit column, the computation takes only ${\displaystyle 2n^{2}}$ scalar multiplications. The same goes if ${\displaystyle v}$ is a unit column. If both ${\displaystyle u}$ and ${\displaystyle v}$ are unit columns, the computation takes only ${\displaystyle n^{2}}$ scalar multiplications.

This formula also has application in theoretical physics. Namely, in quantum field theory, one uses this formula to calculate the propagator of a spin-1 field.[8][circular reference] The inverse propagator (as it appears in the Lagrangian) has the form ${\displaystyle \left(A+uv^{\textsf {T}}\right)}$. One uses the Sherman–Morrison formula to calculate the inverse (satisfying certain time-ordering boundary conditions) of the inverse propagator—or simply the (Feynman) propagator—which is needed to perform any perturbative calculation[9] involving the spin-1 field.

Alternative verification

Following is an alternate verification of the Sherman–Morrison formula using the easily verifiable identity

${\displaystyle \left(I+wv^{\textsf {T}}\right)^{-1}=I-{\frac {wv^{\textsf {T}}}{1+v^{\textsf {T}}w}}}$.

Let

${\displaystyle u=Aw,\quad {\text{and}}\quad A+uv^{\textsf {T}}=A\left(I+wv^{\textsf {T}}\right),}$

then

${\displaystyle \left(A+uv^{\textsf {T}}\right)^{-1}=\left(I+wv^{\textsf {T}}\right)^{-1}A^{-1}=\left(I-{\frac {wv^{\textsf {T}}}{1+v^{\textsf {T}}w}}\right)A^{-1}}$.

Substituting ${\displaystyle w=A^{-1}u}$ gives

${\displaystyle \left(A+uv^{\textsf {T}}\right)^{-1}=\left(I-{\frac {A^{-1}uv^{\textsf {T}}}{1+v^{\textsf {T}}A^{-1}u}}\right)A^{-1}=A^{-1}-{\frac {A^{-1}uv^{\textsf {T}}A^{-1}}{1+v^{\textsf {T}}A^{-1}u}}}$

Generalization (Woodbury matrix identity)

Given a square invertible ${\displaystyle n\times n}$ matrix ${\displaystyle A}$, an ${\displaystyle n\times k}$ matrix ${\displaystyle U}$, and a ${\displaystyle k\times n}$ matrix ${\displaystyle V}$, let ${\displaystyle B}$ be an ${\displaystyle n\times n}$ matrix such that ${\displaystyle B=A+UV}$. Then, assuming ${\displaystyle \left(I_{k}+VA^{-1}U\right)}$ is invertible, we have

${\displaystyle B^{-1}=A^{-1}-A^{-1}U\left(I_{k}+VA^{-1}U\right)^{-1}VA^{-1}.}$