# Sylvester equation

In mathematics, in the field of control theory, a Sylvester equation is a matrix equation of the form:[1]

$A X + X B = C.$

Given matrices A, B, and C, the problem is to find the matrices X that satisfy this equation. All matrices are assumed to have complex entries. For the equation to make sense, the matrices must have compatible sizes; for example, they could all be square matrices of the same size. More generally, A and B must be square matrices of sizes n and m respectively, and then X and C both have n rows and m columns.

A Sylvester equation has a unique solution for X exactly when there are no common eigenvalues of A and -B. More generally, the equation AX+XB=C has been considered as an equation of bounded operators on a (possibly infinite-dimensional) Banach space. In this case, the condition for the uniqueness of a solution X is almost the same: There exists a unique solution X exactly when the spectra of A and -B are disjoint.[2]
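The uniqueness condition can be checked numerically before solving. The following is a minimal sketch, assuming NumPy and SciPy are available; the helper name `spectra_disjoint` and its tolerance `tol` are illustrative choices, not part of any library. `scipy.linalg.solve_sylvester` solves the equation in the form $AX + XB = C$.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def spectra_disjoint(A, B, tol=1e-10):
    """Return True if no eigenvalue of A lies within tol of an eigenvalue of -B.

    The tolerance is an assumption made for this sketch; eigenvalues that are
    distinct but very close still make the problem ill-conditioned.
    """
    eig_A = np.linalg.eigvals(A)
    eig_negB = -np.linalg.eigvals(B)
    # Pairwise distances between the two sets of eigenvalues.
    gaps = np.abs(eig_A[:, None] - eig_negB[None, :])
    return bool(gaps.min() > tol)

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])            # eigenvalues 1 and 3
B = np.array([[4.0, 1.0, 0.0],
              [0.0, 5.0, 1.0],
              [0.0, 0.0, 6.0]])       # eigenvalues 4, 5 and 6
C = np.arange(6, dtype=float).reshape(2, 3)

if spectra_disjoint(A, B):
    X = solve_sylvester(A, B, C)      # solves A X + X B = C
    print("residual norm:", np.linalg.norm(A @ X + X @ B - C))
```

Here A is 2 by 2 and B is 3 by 3, so X and C are both 2 by 3, matching the size convention described above.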

## Existence and uniqueness of the solutions

Using the Kronecker product notation and the vectorization operator $\operatorname{vec}$, we can rewrite Sylvester's equation in the form

$(I_m \otimes A + B^T \otimes I_n) \operatorname{vec}X = \operatorname{vec}C,$

where $A$ is $n \times n$, $B$ is $m \times m$, $X$ and $C$ are $n \times m$, and $I_k$ is the $k \times k$ identity matrix. In this form, the equation can be seen as a linear system of dimension $mn \times mn$.[3]
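The vectorized form can be illustrated with a small NumPy sketch. It assumes $\operatorname{vec}$ stacks columns (column-major order, `order="F"` in NumPy) and allows $A$ of size $n \times n$ and $B$ of size $m \times m$, in which case the identity factors have sizes $m$ and $n$ respectively. The variable names are chosen for illustration only, and this dense approach is practical only for small matrices.

```python
import numpy as np

n, m = 2, 3
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
B = np.array([[4.0, 1.0, 0.0],
              [0.0, 5.0, 1.0],
              [0.0, 0.0, 6.0]])
C = np.arange(n * m, dtype=float).reshape(n, m)

# vec(A X) = (I_m kron A) vec(X) and vec(X B) = (B^T kron I_n) vec(X)
K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))

vec_C = C.reshape(-1, order="F")            # stack the columns of C
vec_X = np.linalg.solve(K, vec_C)           # dense solve of the (n*m) x (n*m) system
X = vec_X.reshape((n, m), order="F")        # undo the vectorization

print("residual norm:", np.linalg.norm(A @ X + X @ B - C))
```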

Proposition. Given complex $n\times n$ matrices $A$ and $B$, Sylvester's equation has a unique solution $X$ for all $C$ if and only if $A$ and $-B$ have no common eigenvalues.

Proof. Consider the linear transformation $S:M_n\rightarrow M_n$ given by $X\mapsto AX+XB$.

(i) Suppose that $A$ and $-B$ have no common eigenvalues. Then their characteristic polynomials $f(z)$ and $g(z)$ have highest common factor $1$. Hence there exist complex polynomials $p(z)$ and $q(z)$ such that $p(z)f(z)+q(z)g(z)=1$. By the Cayley–Hamilton theorem, $f(A)=0=g(-B)$; substituting $A$ into the polynomial identity therefore gives $q(A)g(A)=I$. Let $X$ be any solution of $S(X)=0$, so that $AX=-XB$. Repeating this gives $A^kX=X(-B)^k$ for all $k\geq 0$, and hence $h(A)X=Xh(-B)$ for every polynomial $h$. It follows that $X=q(A)g(A)X=q(A)Xg(-B)=0$. Thus $S$ has trivial kernel, and by the rank–nullity theorem $S$ is invertible, so for all $C$ there exists a unique solution $X$.

(ii) Conversely, suppose that $s$ is a common eigenvalue of $A$ and $-B$. Note that $s$ is also an eigenvalue of the transpose $A^T$. Then there exist non-zero vectors $v$ and $w$ such that $A^Tw=sw$ and $Bv=-sv$. Choose $C$ such that $Cv=\overline{w}$, the vector whose entries are the complex conjugates of those of $w$. Then $AX+XB=C$ has no solution $X$, as is clear from the complex bilinear pairing $\langle (AX+XB)v,w\rangle = \langle \overline{w},w\rangle$: the right-hand side is positive, whereas the left-hand side equals $\langle Xv,A^Tw\rangle + \langle XBv,w\rangle = s\langle Xv,w\rangle - s\langle Xv,w\rangle = 0$.
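The failure case in part (ii) can be seen on a small example. The sketch below uses matrices chosen purely for illustration: $A$ and $-B$ share the eigenvalue $s=1$, with $v=w=e_1$, and $C=I$ satisfies $Cv=\overline{w}$. It then compares the rank of the Kronecker-form coefficient matrix with the rank of the augmented system; a larger augmented rank means the linear system is inconsistent.

```python
import numpy as np

A = np.diag([1.0, 2.0])        # eigenvalues 1 and 2
B = np.diag([-1.0, 5.0])       # -B has eigenvalues 1 and -5, so s = 1 is shared
C = np.eye(2)                  # C e_1 = e_1, matching the choice C v = conj(w)

n = A.shape[0]
m = B.shape[0]
K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
vec_C = C.reshape(-1, order="F")

rank_K = np.linalg.matrix_rank(K)
rank_aug = np.linalg.matrix_rank(np.column_stack([K, vec_C]))

# The (1,1) entry of A X + X B is x_11 - x_11 = 0, but C has a 1 there.
print("rank of K:", rank_K, "rank of [K | vec C]:", rank_aug)
```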

## Roth's removal rule

Given two square complex matrices A and B, of sizes n and m, and a matrix C of size n by m, one can ask when the following two square matrices of size n+m are similar to each other: $\begin{bmatrix} A & C \\ 0 & B \end{bmatrix}$ and $\begin{bmatrix} A & 0 \\0&B \end{bmatrix}$. The answer is that these two matrices are similar exactly when there exists a matrix X such that AX-XB=C. In other words, X is a solution to a Sylvester equation. This is known as Roth's removal rule.[4]

One easily checks one direction: If AX-XB=C then

$\begin{bmatrix}I_n & X \\ 0 & I_m \end{bmatrix} \begin{bmatrix} A&C\\0&B \end{bmatrix} \begin{bmatrix} I_n & -X \\ 0& I_m \end{bmatrix} = \begin{bmatrix} A&0\\0&B \end{bmatrix}.$
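This direction can be checked numerically. The sketch below, assuming NumPy, picks random matrices (an arbitrary choice for illustration), defines $C$ as $AX-XB$ so that the hypothesis holds by construction, and verifies the block identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
X = rng.standard_normal((n, m))
C = A @ X - X @ B                      # ensures A X - X B = C

I_n, I_m = np.eye(n), np.eye(m)
Z = np.zeros((m, n))

P = np.block([[I_n, X], [Z, I_m]])
P_inv = np.block([[I_n, -X], [Z, I_m]])
M = np.block([[A, C], [Z, B]])         # block upper-triangular matrix
D = np.block([[A, np.zeros((n, m))], [Z, B]])  # block-diagonal matrix

print("P M P^{-1} equals diag(A, B):", np.allclose(P @ M @ P_inv, D))
```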

Roth's removal rule does not generalize to infinite-dimensional bounded operators on a Banach space.[5]

## Numerical solutions

A classical algorithm for the numerical solution of the Sylvester equation is the Bartels–Stewart algorithm, which consists of transforming $A$ and $B$ into Schur form by the QR algorithm, and then solving the resulting triangular system via back-substitution. This algorithm, whose computational cost is $O(n^3)$ arithmetic operations, is used by, among others, LAPACK and the lyap function in GNU Octave. See also the syl function in that language.
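The idea can be sketched in a few lines of Python. This is a simplified illustration, not the LAPACK implementation: it uses complex Schur forms from `scipy.linalg.schur` (so both factors are genuinely upper triangular) and solves for one column of the transformed unknown at a time. The function name `sylvester_bartels_stewart` is hypothetical.

```python
import numpy as np
from scipy.linalg import schur, solve_triangular

def sylvester_bartels_stewart(A, B, C):
    """Solve A X + X B = C via complex Schur forms (illustrative sketch)."""
    R, U = schur(A, output="complex")     # A = U R U^H, R upper triangular
    S, V = schur(B, output="complex")     # B = V S V^H, S upper triangular
    F = U.conj().T @ C @ V                # transformed right-hand side
    n, m = F.shape
    Y = np.zeros((n, m), dtype=complex)   # Y = U^H X V solves R Y + Y S = F
    I = np.eye(n)
    for k in range(m):
        # Column k of R Y + Y S = F reads
        # (R + S[k, k] I) y_k = f_k - sum_{j < k} S[j, k] y_j.
        rhs = F[:, k] - Y[:, :k] @ S[:k, k]
        Y[:, k] = solve_triangular(R + S[k, k] * I, rhs)  # upper triangular solve
    # The result is complex even for real inputs (imaginary part near zero).
    return U @ Y @ V.conj().T

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
B = np.array([[4.0, 1.0, 0.0],
              [0.0, 5.0, 1.0],
              [1.0, 0.0, 6.0]])
C = np.arange(6, dtype=float).reshape(2, 3)

X = sylvester_bartels_stewart(A, B, C)
print("residual norm:", np.linalg.norm(A @ X + X @ B - C))
```

Each triangular system has diagonal entries of the form $r_{ii}+s_{kk}$, a sum of an eigenvalue of $A$ and an eigenvalue of $B$, so the back-substitution succeeds exactly when $A$ and $-B$ have no common eigenvalue.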

## Notes

1. ^ This equation is also commonly written in the equivalent form of AX-XB=C.
2. ^ Bhatia and Rosenthal, 1997
3. ^ However, rewriting the equation in this form is not advised for the numerical solution since this version is costly to solve and can be ill-conditioned.
4. ^ Gerrish, F.; Ward, A. G. B. (Nov 1998). "Sylvester's matrix equation and Roth's removal rule". The Mathematical Gazette 82 (495): 423–430.
5. ^ Bhatia and Rosenthal, p.3

## References

• Sylvester, J. (1884). "Sur l'équation en matrices $px = xq$". C. R. Acad. Sc. Paris 99 (2): 67–71, 115–116.
• Bartels, R. H.; Stewart, G. W. (1972). "Solution of the matrix equation $AX +XB = C$". Comm. ACM 15 (9): 820–826. doi:10.1145/361573.361582.
• Bhatia, R.; Rosenthal, P. (1997). "How and why to solve the operator equation $AX -XB = Y$ ?". Bull. London Math. Soc. 29 (1): 1–21. doi:10.1112/S0024609396001828.
• Lee, S.-G.; Vu, Q.-P. (2011). "Simultaneous solutions of Sylvester equations and idempotent matrices separating the joint spectrum". Linear Algebra Appl. 435 (9): 2097–2109. doi:10.1016/j.laa.2010.09.034.
• Birkhoff and MacLane. A Survey of Modern Algebra. Macmillan. pp. 213, 299.