# Preconditioner

"Preconditioning" redirects here. For other uses, see Preconditioning (disambiguation).

In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solving methods. Preconditioning is typically related to reducing a condition number of the problem. The preconditioned problem is then usually solved by an iterative method.

## Preconditioning for linear systems

In linear algebra and numerical analysis, a preconditioner ${\displaystyle P}$ of a matrix ${\displaystyle A}$ is a matrix such that ${\displaystyle P^{-1}A}$ has a smaller condition number than ${\displaystyle A}$. It is also common to call ${\displaystyle T=P^{-1}}$ the preconditioner, rather than ${\displaystyle P}$, since ${\displaystyle P}$ itself is rarely explicitly available. In modern preconditioning, the application of ${\displaystyle T=P^{-1}}$, i.e., multiplication of a column vector, or a block of column vectors, by ${\displaystyle T=P^{-1}}$, is commonly performed by rather sophisticated computer software packages in a matrix-free fashion, i.e., where neither ${\displaystyle P}$, nor ${\displaystyle T=P^{-1}}$ (and often not even ${\displaystyle A}$) are explicitly available in a matrix form.

Preconditioners are useful in iterative methods to solve a linear system ${\displaystyle Ax=b}$ for ${\displaystyle x}$ since the rate of convergence for most iterative linear solvers increases as the condition number of a matrix decreases as a result of preconditioning. Preconditioned iterative solvers typically outperform direct solvers, e.g., Gaussian elimination, for large, especially for sparse, matrices. Iterative solvers can be used as matrix-free methods, i.e. become the only choice if the coefficient matrix ${\displaystyle A}$ is not stored explicitly, but is accessed by evaluating matrix-vector products.

### Description

Instead of solving the original linear system above, one may solve either the right preconditioned system:

${\displaystyle AP^{-1}Px=b}$

via solving

${\displaystyle AP^{-1}y=b}$

for ${\displaystyle y}$ and

${\displaystyle Px=y}$

for ${\displaystyle x}$; or the left preconditioned system:

${\displaystyle P^{-1}(Ax-b)=0}$

both of which give the same solution as the original system so long as the preconditioner matrix ${\displaystyle P}$ is nonsingular. The left preconditioning is more common.

The goal of this preconditioned system is to reduce the condition number of the left or right preconditioned system matrix ${\displaystyle P^{-1}A}$ or ${\displaystyle AP^{-1},}$ respectively. The preconditioned matrix ${\displaystyle P^{-1}A}$ or ${\displaystyle AP^{-1}}$ is almost never explicitly formed. Only the action of applying the preconditioner solve operation ${\displaystyle P^{-1}}$ to a given vector need to be computed in iterative methods.

Typically there is a trade-off in the choice of ${\displaystyle P}$. Since the operator ${\displaystyle P^{-1}}$ must be applied at each step of the iterative linear solver, it should have a small cost (computing time) of applying the ${\displaystyle P^{-1}}$ operation. The cheapest preconditioner would therefore be ${\displaystyle P=I}$ since then ${\displaystyle P^{-1}=I.}$ Clearly, this results in the original linear system and the preconditioner does nothing. At the other extreme, the choice ${\displaystyle P=A}$ gives ${\displaystyle P^{-1}A=AP^{-1}=I,}$ which has optimal condition number of 1, requiring a single iteration for convergence; however in this case ${\displaystyle P^{-1}=A^{-1},}$ and applying the preconditioner is as difficult as solving the original system. One therefore chooses ${\displaystyle P}$ as somewhere between these two extremes, in an attempt to achieve a minimal number of linear iterations while keeping the operator ${\displaystyle P^{-1}}$ as simple as possible. Some examples of typical preconditioning approaches are detailed below.

### Preconditioned iterative methods

Preconditioned iterative methods for ${\displaystyle Ax-b=0}$ are, in most cases, mathematically equivalent to standard iterative methods applied to the preconditioned system ${\displaystyle P^{-1}(Ax-b)=0.}$ For example, the standard Richardson iteration for solving ${\displaystyle Ax-b=0}$ is

${\displaystyle \mathbf {x} _{n+1}=\mathbf {x} _{n}-\gamma _{n}(A\mathbf {x} _{n}-\mathbf {b} ),\ n\geq 0.}$

Applied to the preconditioned system ${\displaystyle P^{-1}(Ax-b)=0,}$ it turns into a preconditioned method

${\displaystyle \mathbf {x} _{n+1}=\mathbf {x} _{n}-\gamma _{n}P^{-1}(A\mathbf {x} _{n}-\mathbf {b} ),\ n\geq 0.}$

Examples of popular preconditioned iterative methods for linear systems include the preconditioned conjugate gradient method, the biconjugate gradient method, and generalized minimal residual method. Iterative methods, which use scalar products to compute the iterative parameters, require corresponding changes in the scalar product together with substituting ${\displaystyle P^{-1}(Ax-b)=0}$ for ${\displaystyle Ax-b=0.}$

### Geometric interpretation

For a symmetric positive definite matrix ${\displaystyle A}$ the preconditioner ${\displaystyle P}$ is typically chosen to be symmetric positive definite as well. The preconditioned operator ${\displaystyle P^{-1}A}$ is then also symmetric positive definite, but with respect to the ${\displaystyle P}$-based scalar product. In this case, the desired effect in applying a preconditioner is to make the quadratic form of the preconditioned operator ${\displaystyle P^{-1}A}$ with respect to the ${\displaystyle P}$-based scalar product to be nearly spherical [1].

### Variable and non-linear preconditioning

Denoting ${\displaystyle T=P^{-1}}$, we highlight that preconditioning is practically implemented as multiplying some vector ${\displaystyle r}$ by ${\displaystyle T}$, i.e., computing the product ${\displaystyle Tr.}$ In many applications, ${\displaystyle T}$ is not given as a matrix, but rather as an operator ${\displaystyle T(r)}$ acting on the vector ${\displaystyle r}$. Some popular preconditioners, however, change with ${\displaystyle r}$ and the dependence on ${\displaystyle r}$ may not be linear. Typical examples involve using non-linear iterative methods, e.g., the conjugate gradient method, as a part of the preconditioner construction. Such preconditioners may be practically very efficient, however, their behavior is hard to predict theoretically.

### Spectrally equivalent preconditioning

The most common use of preconditioning is for iterative solution of linear systems resulting from approximations of partial differential equations. The better the approximation quality, the larger the matrix size is. In such a case, the goal of optimal preconditioning is, on the one side, to make the spectral condition number of ${\displaystyle P^{-1}A}$ to be bounded from above by a constant independent of the matrix size, which is called spectrally equivalent preconditioning by D'yakonov. On the other hand, the cost of application of the ${\displaystyle P^{-1}}$ should ideally be proportional (also independent of the matrix size) to the cost of multiplication of ${\displaystyle A}$ by a vector.

### Examples

#### Jacobi (or diagonal) preconditioner

The Jacobi preconditioner is one of the simplest forms of preconditioning, in which the preconditioner is chosen to be the diagonal of the matrix ${\displaystyle P=\mathrm {diag} (A).}$ Assuming ${\displaystyle A_{ii}\neq 0,\forall i}$, we get ${\displaystyle P_{ij}^{-1}={\frac {\delta _{ij}}{A_{ij}}}.}$ It is efficient for diagonally dominant matrices ${\displaystyle A}$.

#### SPAI

The Sparse Approximate Inverse preconditioner minimises ${\displaystyle \|AT-I\|_{F},}$ where ${\displaystyle \|\cdot \|_{F}}$ is the Frobenius norm and ${\displaystyle T=P^{-1}}$ is from some suitably constrained set of sparse matrices. Under the Frobenius norm, this reduces to solving numerous independent least-squares problems (one for every column). The entries in ${\displaystyle T}$ must be restricted to some sparsity pattern or the problem remains as difficult and time-consuming as finding the exact inverse of ${\displaystyle A}$. The method was introduced by M.J. Grote and T. Huckle together with an approach to selecting sparsity patterns.[1]