In mathematics, the matrix exponential is a matrix function on square matrices analogous to the ordinary exponential function. Abstractly, the matrix exponential gives the connection between a matrix Lie algebra and the corresponding Lie group.
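Let X be an n×n real or complex matrix. The exponential of X, denoted by $e^{X}$ or exp(X), is the n×n matrix given by the power series

$$e^{X} = \sum_{k=0}^{\infty} \frac{1}{k!} X^{k} .$$

This series always converges, so the exponential of X is well-defined. If X is a 1×1 matrix, the matrix exponential of X is the 1×1 matrix whose single entry is the ordinary exponential of the single entry of X.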
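As a rough illustration of the defining power series (the sum of $X^{k}/k!$ over k ≥ 0), the sketch below adds up a fixed number of terms in plain Python. The names `mat_mul` and `expm_series` and the cutoff of 30 terms are assumptions made for this example; a truncated series is only adequate for small matrices of modest norm and is not a recommended general-purpose method.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm_series(X, terms=30):
    """Approximate exp(X) by the partial sum of X^k / k! for k < terms."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0 term: I
    term = [row[:] for row in result]                               # holds X^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

one_by_one = expm_series([[1.0]])                 # 1x1 case: ordinary exponential
zero_exp = expm_series([[0.0, 0.0], [0.0, 0.0]])  # exponential of the zero matrix
print(one_by_one[0][0], math.e)
```

For the 1×1 input the result agrees with the scalar exponential, and the zero matrix is mapped to the identity.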
Properties
Let X and Y be n×n complex matrices and let a and b be arbitrary complex numbers. We denote the n×n identity matrix by I and the zero matrix by 0. The matrix exponential satisfies the following properties:
- $e^{0} = I$
- $e^{aX} e^{bX} = e^{(a+b)X}$
- $e^{X} e^{-X} = I$
- If XY = YX then $e^{X} e^{Y} = e^{Y} e^{X} = e^{X+Y}$.
- If Y is invertible then $e^{YXY^{-1}} = Y e^{X} Y^{-1}$.
- $\exp(X^{\mathsf{T}}) = (\exp X)^{\mathsf{T}}$, where $X^{\mathsf{T}}$ denotes the transpose of X. It follows that if X is symmetric then $e^{X}$ is also symmetric, and that if X is skew-symmetric then $e^{X}$ is orthogonal.
- $\exp(X^{*}) = (\exp X)^{*}$, where $X^{*}$ denotes the conjugate transpose of X. It follows that if X is Hermitian then $e^{X}$ is also Hermitian, and that if X is skew-Hermitian then $e^{X}$ is unitary.
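A few of these properties can be spot-checked numerically. The sketch below is an illustration only: it uses a truncated power series (the helper names `mat_mul`, `transpose` and `expm_series` are assumptions of this example) and a skew-symmetric 2×2 matrix, whose exponential should be the rotation by the angle `theta`.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def expm_series(X, terms=40):
    """Partial sum of the exponential series; fine for small matrices of modest norm."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

theta = 0.7
X = [[0.0, -theta], [theta, 0.0]]                 # skew-symmetric
R = expm_series(X)                                # should be a rotation matrix
neg_X = [[-x for x in row] for row in X]
inverse_check = mat_mul(R, expm_series(neg_X))    # e^X e^{-X}, should be I
orthogonality_check = mat_mul(transpose(R), R)    # (e^X)^T e^X, should be I
```

Here `R` matches the matrix with rows (cos θ, −sin θ) and (sin θ, cos θ), and both check products are the identity up to rounding.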
Linear differential equation systems
One of the reasons for the importance of the matrix exponential is that it can be used to solve systems of linear ordinary differential equations. The solution of

$$\frac{d}{dt} y(t) = A\, y(t), \qquad y(0) = y_0,$$

where A is a constant matrix, is given by

$$y(t) = e^{At} y_0 .$$

The matrix exponential can also be used to solve the inhomogeneous equation

$$\frac{d}{dt} y(t) = A\, y(t) + z(t), \qquad y(0) = y_0 .$$
See the section on applications below for examples.
In general there is no closed-form solution for differential equations of the form

$$\frac{d}{dt} y(t) = A(t)\, y(t), \qquad y(0) = y_0,$$

where A is not constant, but the Magnus series gives the solution as an infinite sum.
The exponential of sums
We know that the exponential function satisfies $e^{x+y} = e^{x} e^{y}$ for any real numbers (scalars) x and y. The same holds for commuting matrices: if the matrices X and Y commute (meaning that XY = YX), then

$$e^{X+Y} = e^{X} e^{Y} .$$

However, if they do not commute, then the above equality does not necessarily hold. In that case the Baker–Campbell–Hausdorff formula expresses the matrix Z satisfying $e^{Z} = e^{X} e^{Y}$ in terms of X, Y and their iterated commutators.
The converse is false: the equation $e^{X+Y} = e^{X} e^{Y}$ does not necessarily imply that X and Y commute.
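A small worked example may help to show how the product rule fails without commutativity. The two nilpotent matrices below are chosen here purely for illustration; both exponentials terminate after the linear term, and $(X+Y)^2 = I$ gives the hyperbolic functions on the right.

```latex
\[
X = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
Y = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad
XY = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \neq
YX = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]
\[
e^{X} e^{Y} = (I + X)(I + Y) = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix},
\qquad
e^{X+Y} = \cosh(1)\, I + \sinh(1)\,(X + Y)
        = \begin{bmatrix} \cosh 1 & \sinh 1 \\ \sinh 1 & \cosh 1 \end{bmatrix}
\]
```

Since cosh 1 ≈ 1.543 and sinh 1 ≈ 1.175, the two results differ.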
If A and H are Hermitian matrices, then

$$\operatorname{tr} \exp(A + H) \le \operatorname{tr}\big(\exp(A) \exp(H)\big) .$$

This is the Golden–Thompson inequality. Note that there is no requirement of commutativity. There are counterexamples showing that the Golden–Thompson inequality cannot be extended to three matrices; in any event, tr(exp(A) exp(B) exp(C)) is not guaranteed to be real for Hermitian A, B, C. However, Lieb's theorem (see the references) accomplishes this in a way.
The exponential map
Note that the exponential of a matrix is always an invertible matrix. The inverse matrix of $e^{X}$ is given by $e^{-X}$. This is analogous to the fact that the exponential of a complex number is always nonzero. The matrix exponential then gives us a map

$$\exp \colon M_n(\mathbb{C}) \to \mathrm{GL}(n, \mathbb{C})$$

from the space of all n×n matrices to the general linear group of degree n, i.e. the group of all n×n invertible matrices. In fact, this map is surjective, which means that every invertible matrix can be written as the exponential of some other matrix (for this, it is essential to consider the field C of complex numbers and not R).
For any matrix X, the map

$$t \mapsto e^{tX}, \qquad t \in \mathbb{R},$$

defines a smooth curve in the general linear group which passes through the identity element at t = 0.
In fact, this gives a one-parameter subgroup of the general linear group since

$$e^{tX} e^{sX} = e^{(t+s)X} .$$

The derivative of this curve (or tangent vector) at a point t is given by

$$\frac{d}{dt} e^{tX} = X e^{tX} = e^{tX} X .$$

The derivative at t = 0 is just the matrix X, which is to say that X generates this one-parameter subgroup.
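More generally, for an exponent X(t) that depends on t,

$$\frac{d}{dt} e^{X(t)} = \int_0^1 e^{\alpha X(t)}\, \frac{dX(t)}{dt}\, e^{(1-\alpha) X(t)}\, d\alpha .$$

Taking $e^{X(t)}$ outside the integral sign in the above expression and expanding the integrand with the help of the Hadamard lemma, one obtains the following useful expression for the derivative of the matrix exponential:

$$e^{-X(t)}\, \frac{d}{dt} e^{X(t)} = \frac{dX(t)}{dt} - \frac{1}{2!}\left[X(t), \frac{dX(t)}{dt}\right] + \frac{1}{3!}\left[X(t), \left[X(t), \frac{dX(t)}{dt}\right]\right] - \cdots$$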
The determinant of the matrix exponential
By Jacobi's formula, for any complex square matrix A the following identity holds:

$$\det\left(e^{A}\right) = e^{\operatorname{tr}(A)} .$$

In addition to providing a computational tool, this formula demonstrates that a matrix exponential is always an invertible matrix: the right-hand side of the above equation is always non-zero, so $\det(e^{A}) \neq 0$, which means that $e^{A}$ must be invertible.
In the real-valued case, the formula also shows that the map

$$\exp \colon M_n(\mathbb{R}) \to \mathrm{GL}(n, \mathbb{R})$$

is not surjective, in contrast to the complex case mentioned earlier. This follows from the fact that, for real-valued matrices, the right-hand side of the formula is always positive, while there exist invertible matrices with a negative determinant.
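The identity det(exp A) = exp(tr A) is easy to check numerically for a small real matrix. The sketch below is illustrative only; the matrix `A` and the helper names are assumptions of this example, and the exponential is approximated by a truncated power series.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm_series(X, terms=40):
    """Partial sum of the exponential series; fine for small matrices of modest norm."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

A = [[0.5, 1.0], [-0.3, 0.2]]
E = expm_series(A)
det_of_exp = E[0][0] * E[1][1] - E[0][1] * E[1][0]   # det(e^A) for a 2x2 matrix
exp_of_trace = math.exp(A[0][0] + A[1][1])           # e^{tr A}
print(det_of_exp, exp_of_trace)
```

Both numbers agree up to rounding and are positive, as the argument about real matrices requires.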
Computing the matrix exponential
Finding reliable and accurate methods to compute the matrix exponential is difficult, and this is still a topic of considerable current research in mathematics and numerical analysis. Both MATLAB and GNU Octave use a Padé approximant. Several methods are listed below.
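One commonly described idea, scaling and squaring, can be sketched in a few lines. The version below is an assumption-laden illustration, not the algorithm used by MATLAB or GNU Octave: it substitutes a truncated Taylor polynomial for the Padé approximant, and the function names, the threshold 0.5 and the default of 18 terms are choices made only for this example. It relies on the identity $e^{X} = (e^{X/2^{s}})^{2^{s}}$.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def taylor_exp(X, terms):
    """Truncated Taylor series of exp(X); accurate when the norm of X is small."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def expm_scaling_squaring(X, terms=18):
    """Scale X by 2^-s until its infinity norm is at most 0.5, then square s times."""
    norm = max(sum(abs(x) for x in row) for row in X)
    squarings = 0
    while norm / 2 ** squarings > 0.5:
        squarings += 1
    scaled = [[x / 2 ** squarings for x in row] for row in X]
    result = taylor_exp(scaled, terms)
    for _ in range(squarings):
        result = mat_mul(result, result)   # undo the scaling: exp(X) = exp(X/2)^2
    return result

E = expm_scaling_squaring([[0.0, 6.0], [-6.0, 0.0]])
```

For this input the exact answer is the matrix with rows (cos 6, sin 6) and (−sin 6, cos 6), which the sketch reproduces to roughly machine precision.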
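If a matrix is diagonal:

$$A = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{bmatrix},$$

then its exponential can be obtained by just exponentiating every entry on the main diagonal:

$$e^{A} = \begin{bmatrix} e^{a_1} & 0 & \cdots & 0 \\ 0 & e^{a_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{a_n} \end{bmatrix} .$$

This also allows one to exponentiate diagonalizable matrices. If $A = UDU^{-1}$ and D is diagonal, then $e^{A} = U e^{D} U^{-1}$. Application of Sylvester's formula yields the same result. The reason is that multiplication of diagonal matrices is equivalent to element-wise multiplication, so every power $D^{k}$, and hence the exponential series, acts on each diagonal entry separately as the ordinary scalar exponential.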
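As a worked instance of the rule exp(UDU⁻¹) = U exp(D) U⁻¹, consider the symmetric matrix below, chosen here only for illustration. Its eigenvalues are 3 and 1, with eigenvectors (1, 1) and (1, −1).

```latex
\[
A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
  = U D U^{-1}, \qquad
U = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad
D = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \quad
U^{-1} = \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\]
\[
e^{A} = U \begin{bmatrix} e^{3} & 0 \\ 0 & e \end{bmatrix} U^{-1}
      = \frac{1}{2} \begin{bmatrix} e^{3} + e & e^{3} - e \\ e^{3} - e & e^{3} + e \end{bmatrix}
\]
```

The result is symmetric, consistent with the property that the exponential of a symmetric matrix is symmetric.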
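If the matrix in question is a projection matrix (idempotent, $P^{2} = P$), then its matrix exponential is $e^{P} = I + (e - 1)P$, which is easy to show by expanding the definition of the exponential:

$$e^{P} = I + \sum_{k=1}^{\infty} \frac{P^{k}}{k!} = I + \left(\sum_{k=1}^{\infty} \frac{1}{k!}\right) P = I + (e - 1) P .$$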
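A matrix N is nilpotent if $N^{q} = 0$ for some integer q. In this case, the matrix exponential $e^{N}$ can be computed directly from the series expansion, as the series terminates after a finite number of terms:

$$e^{N} = I + N + \frac{1}{2} N^{2} + \frac{1}{6} N^{3} + \cdots + \frac{1}{(q-1)!} N^{q-1} .$$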
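Because the series terminates, the exponential of a nilpotent matrix can be computed exactly. The sketch below uses rational arithmetic from the standard library; the function name `exp_nilpotent` and the 3×3 shift matrix are assumptions chosen for this illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def exp_nilpotent(N):
    """Sum I + N + N^2/2! + ... exactly, stopping once N^k / k! vanishes."""
    n = len(N)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    term = [[Fraction(x) for x in row] for row in N]   # holds N^k / k!, starting at k = 1
    k = 1
    while any(x != 0 for row in term for x in row):
        if k > n:
            # An n-by-n nilpotent matrix satisfies N^n = 0, so this cannot be one.
            raise ValueError("matrix is not nilpotent")
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
        k += 1
        term = [[x / k for x in row] for row in mat_mul(term, N)]
    return result

N = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # N^3 = 0
exp_N = exp_nilpotent(N)
```

For this shift matrix the result is upper triangular with 1 on the diagonal, 1 on the first superdiagonal and 1/2 in the corner.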
When the minimal polynomial of a matrix X can be factored into a product of first degree polynomials, X can be expressed as a sum

$$X = A + N,$$

where
- A is diagonalizable
- N is nilpotent
- A commutes with N (i.e. AN = NA)
This is the Jordan–Chevalley decomposition.
This means that we can compute the exponential of X by reducing to the previous two cases:

$$e^{X} = e^{A+N} = e^{A} e^{N} .$$
Note that we need the commutativity of A and N for the last step to work.
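When X is brought to Jordan canonical form, we therefore need only know how to compute the matrix exponential of a Jordan block. Each Jordan block is of the form

$$J_{a}(\lambda) = \lambda I + N,$$

where N is a special nilpotent matrix (ones on the first superdiagonal and zeros elsewhere). The matrix exponential of this block is given by

$$e^{\lambda I + N} = e^{\lambda} e^{N} .$$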
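For concreteness, here is the 3×3 case written out, using the fact that the nilpotent part satisfies $N^{3} = 0$ so that $e^{N} = I + N + N^{2}/2$. The size 3 is chosen only for illustration.

```latex
\[
J_3(\lambda) = \begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix}
= \lambda I + N, \qquad
N = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
\[
e^{J_3(\lambda)} = e^{\lambda} \left( I + N + \tfrac{1}{2} N^{2} \right)
= e^{\lambda} \begin{bmatrix} 1 & 1 & \tfrac{1}{2} \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]
```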
Evaluation by Laurent series
If P and $Q_t$ are nonzero polynomials in one variable, such that P(A) = 0, and if the meromorphic function

$$f(z) = \frac{e^{tz} - Q_t(z)}{P(z)}$$

is entire, then

$$e^{tA} = Q_t(A) .$$

To prove this, multiply the first of the two above equalities by P(z) and replace z by A.
Such a polynomial $Q_t$ can be found as follows. Let a be a root of P, and $Q_{a,t}$ the product of P by the principal part of the Laurent series of $e^{tz}/P(z)$ at a. Then the sum $S_t$ of the $Q_{a,t}$, where a runs over all the roots of P, can be taken as a particular $Q_t$. All the other $Q_t$ will be obtained by adding a multiple of P to $S_t$. In particular $S_t$ is the only $Q_t$ whose degree is less than that of P.
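Example: Consider the case of an arbitrary 2-by-2 matrix,

$$A := \begin{bmatrix} a & b \\ c & d \end{bmatrix} .$$

The exponential matrix $e^{tA}$, by virtue of the Cayley–Hamilton theorem, must be of the form

$$e^{tA} = s_0(t)\, I + s_1(t)\, A .$$

(For any complex number z and any C-algebra B, we denote again by z the product of z by the unit of B.) Let α and β be the roots of the characteristic polynomial of A,

$$P(z) = z^{2} - (a + d)\, z + ad - bc = (z - \alpha)(z - \beta) .$$

Then we have

$$S_t(z) = e^{\alpha t}\, \frac{z - \beta}{\alpha - \beta} + e^{\beta t}\, \frac{z - \alpha}{\beta - \alpha}$$

if α ≠ β, and

$$S_t(z) = e^{\alpha t} \big(1 + t\,(z - \alpha)\big)$$

if α = β. In either case, writing

$$s \equiv \frac{\alpha + \beta}{2} = \frac{\operatorname{tr} A}{2}, \qquad q \equiv \frac{\alpha - \beta}{2} = \pm\sqrt{-\det(A - sI)},$$

we have

$$s_0(t) = e^{st} \left(\cosh(qt) - s\, \frac{\sinh(qt)}{q}\right), \qquad s_1(t) = e^{st}\, \frac{\sinh(qt)}{q},$$

where
- $\sinh(qt)/q$ is 0 if t = 0, and t if q = 0.
Thus, as indicated above, the matrix A having decomposed into the sum of two mutually commuting pieces, the traceful piece and the traceless piece,

$$A = sI + (A - sI),$$

the matrix exponential reduces to a plain product of the exponentials of the two respective pieces,

$$e^{tA} = e^{st} \left(\cosh(qt)\, I + \frac{\sinh(qt)}{q}\, (A - sI)\right) .$$

This is a formula often used in physics, as it amounts to the analog of Euler's formula for Pauli spin matrices, that is, rotations of the doublet representation of the group SU(2).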
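The 2-by-2 case lends itself to a direct implementation. The sketch below assumes the closed form exp(tA) = s₀(t) I + s₁(t) A with s = tr(A)/2, q² = −det(A − sI), s₁(t) = e^{st} sinh(qt)/q and s₀(t) = e^{st}(cosh(qt) − s·sinh(qt)/q), where sinh(qt)/q is read as t when q = 0. The function name `expm_2x2` and the tolerance used to detect q = 0 are assumptions of this example; complex arithmetic is used so that purely imaginary q (rotations) is handled by the same code path.

```python
import cmath

def expm_2x2(A, t=1.0):
    """Closed-form exp(tA) for a 2x2 matrix A, returned with complex entries."""
    (a, b), (c, d) = A
    s = (a + d) / 2                                  # half the trace
    q = cmath.sqrt(-((a - s) * (d - s) - b * c))     # q^2 = -det(A - sI)
    if abs(q) < 1e-12:
        ratio = t                                    # limit of sinh(qt)/q as q -> 0
    else:
        ratio = cmath.sinh(q * t) / q
    s0 = cmath.exp(s * t) * (cmath.cosh(q * t) - s * ratio)
    s1 = cmath.exp(s * t) * ratio
    return [[s0 + s1 * a, s1 * b], [s1 * c, s0 + s1 * d]]

rotation = expm_2x2([[0.0, -1.0], [1.0, 0.0]], t=0.5)   # q = i: a rotation by 0.5
shear = expm_2x2([[2.0, 1.0], [0.0, 2.0]])              # q = 0: repeated eigenvalue 2
```

The first call reproduces the rotation matrix with entries cos 0.5 and ±sin 0.5; the second gives e² times the matrix with rows (1, 1) and (0, 1).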
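The polynomial $S_t$ can also be given the following "interpolation" characterization. Put $e_t(z) \equiv e^{tz}$ and $n \equiv \deg P$. Then $S_t(z)$ is the unique polynomial of degree < n which satisfies $S_t^{(k)}(a) = e_t^{(k)}(a)$ whenever k is less than the multiplicity of a as a root of P. We assume (as we obviously can) that P is the minimal polynomial of A. We also assume that A is a diagonalizable matrix. In particular, the roots of P are simple, and the "interpolation" characterization tells us that $S_t$ is given by the Lagrange interpolation formula.
At the other extreme, if $P = (z - a)^{n}$, then

$$S_t = e^{at} \sum_{k=0}^{n-1} \frac{t^{k}}{k!}\, (z - a)^{k} .$$

The simplest case not covered by the above observations is when $P = (z - a)^{2} (z - b)$ with a ≠ b, which gives

$$S_t = e^{at}\, \frac{z - b}{a - b} \left(1 + \left(t + \frac{1}{b - a}\right)(z - a)\right) + e^{bt}\, \frac{(z - a)^{2}}{(b - a)^{2}} .$$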
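Suppose that we want to compute the exponential of

$$B = \begin{bmatrix} 21 & 17 & 6 \\ -5 & -1 & -6 \\ 4 & 4 & 16 \end{bmatrix} .$$

Its Jordan form is

$$J = P^{-1} B P = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 16 & 1 \\ 0 & 0 & 16 \end{bmatrix},$$

where the matrix P is given by

$$P = \begin{bmatrix} -\tfrac{1}{4} & 2 & \tfrac{5}{4} \\ \tfrac{1}{4} & -2 & -\tfrac{1}{4} \\ 0 & 4 & 0 \end{bmatrix} .$$

Let us first calculate exp(J). We have

$$J = J_1(4) \oplus J_2(16) .$$

The exponential of a 1×1 matrix is just the exponential of the one entry of the matrix, so $\exp(J_1(4)) = [e^{4}]$. The exponential of $J_2(16)$ can be calculated by the formula $e^{\lambda I + N} = e^{\lambda} e^{N}$ mentioned above; this yields

$$\exp\left(\begin{bmatrix} 16 & 1 \\ 0 & 16 \end{bmatrix}\right) = e^{16} \exp\left(\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\right) = e^{16} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} e^{16} & e^{16} \\ 0 & e^{16} \end{bmatrix} .$$

Therefore, the exponential of the original matrix B is

$$\exp(B) = P \exp(J) P^{-1} = P \begin{bmatrix} e^{4} & 0 & 0 \\ 0 & e^{16} & e^{16} \\ 0 & 0 & e^{16} \end{bmatrix} P^{-1} = \frac{1}{4} \begin{bmatrix} 13e^{16} - e^{4} & 13e^{16} - 5e^{4} & 2e^{16} - 2e^{4} \\ -9e^{16} + e^{4} & -9e^{16} + 5e^{4} & -2e^{16} + 2e^{4} \\ 16e^{16} & 16e^{16} & 4e^{16} \end{bmatrix} .$$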
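The Jordan-form route exp(B) = P exp(J) P⁻¹ can be replayed numerically. The sketch below takes, as an assumption of this example, a 3×3 matrix B with Jordan blocks J₁(4) and J₂(16) together with a matching change-of-basis matrix P; the inverse `P_inv` was worked out by hand for this illustration, and the first check confirms that P J P⁻¹ really does reproduce B.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

B = [[21, 17, 6], [-5, -1, -6], [4, 4, 16]]
P = [[-0.25, 2.0, 1.25], [0.25, -2.0, -0.25], [0.0, 4.0, 0.0]]
P_inv = [[1.0, 5.0, 2.0], [0.0, 0.0, 0.25], [1.0, 1.0, 0.0]]
J = [[4.0, 0.0, 0.0], [0.0, 16.0, 1.0], [0.0, 0.0, 16.0]]

# Sanity check of the decomposition: P J P^{-1} should give back B.
reconstructed = mat_mul(mat_mul(P, J), P_inv)

# exp(J) block by block: [e^4] for J_1(4), and e^16 * [[1, 1], [0, 1]] for J_2(16).
e4, e16 = math.exp(4), math.exp(16)
exp_J = [[e4, 0.0, 0.0], [0.0, e16, e16], [0.0, 0.0, e16]]
exp_B = mat_mul(mat_mul(P, exp_J), P_inv)
```

The top-left entry of `exp_B` comes out as (13e¹⁶ − e⁴)/4 and the bottom-right entry as e¹⁶.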
Linear differential equations
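The matrix exponential has applications to systems of linear differential equations. (See also matrix differential equation.) Recall from earlier in this article that a differential equation of the form

$$y' = C y$$

has solution $e^{Ct} y(0)$. If we consider the vector

$$y(t) = \begin{bmatrix} y_1(t) \\ \vdots \\ y_n(t) \end{bmatrix},$$

we can express a system of coupled linear differential equations as

$$y'(t) = A\, y(t) + b(t) .$$

Multiplying throughout by the integrating factor $e^{-At}$, we obtain

$$\begin{aligned} e^{-At} y' - e^{-At} A\, y &= e^{-At} b \\ e^{-At} y' - A e^{-At} y &= e^{-At} b \\ \frac{d}{dt}\left(e^{-At} y\right) &= e^{-At} b . \end{aligned}$$

The second step is possible due to the fact that if AB = BA then $e^{At} B = B e^{At}$. If we can calculate $e^{At}$, then we can obtain the solution to the system.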
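Say we have the system

$$\begin{aligned} x' &= 2x - y + z \\ y' &= 3y - z \\ z' &= 2x + y + 3z . \end{aligned}$$

We have the associated matrix

$$A = \begin{bmatrix} 2 & -1 & 1 \\ 0 & 3 & -1 \\ 2 & 1 & 3 \end{bmatrix} .$$

The matrix exponential is

$$e^{tA} = \frac{1}{2} \begin{bmatrix} e^{2t}\left(1 + e^{2t} - 2t\right) & -2t e^{2t} & e^{2t}\left(-1 + e^{2t}\right) \\ -e^{2t}\left(-1 + e^{2t} - 2t\right) & 2(t + 1) e^{2t} & -e^{2t}\left(-1 + e^{2t}\right) \\ e^{2t}\left(-1 + e^{2t} + 2t\right) & 2t e^{2t} & e^{2t}\left(1 + e^{2t}\right) \end{bmatrix},$$

so the general solution of the system is

$$\begin{bmatrix} x(t) \\ y(t) \\ z(t) \end{bmatrix} = e^{tA} \begin{bmatrix} x(0) \\ y(0) \\ z(0) \end{bmatrix} .$$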
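The recipe y(t) = exp(tA) y(0) can be tried out on a small system. The sketch below assumes the illustrative coefficient matrix A with rows (2, −1, 1), (0, 3, −1), (2, 1, 3) and the initial value (1, 0, 0); the helper names are hypothetical, and the exponential is approximated by a truncated power series rather than a production algorithm.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm_series(X, terms=40):
    """Partial sum of the exponential series; fine for small matrices of modest norm."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def solve_linear_system(A, y0, t):
    """Return y(t) = exp(tA) y0 for the constant-coefficient system y' = A y."""
    tA = [[t * a for a in row] for row in A]
    E = expm_series(tA)
    return [sum(e * y for e, y in zip(row, y0)) for row in E]

A = [[2.0, -1.0, 1.0], [0.0, 3.0, -1.0], [2.0, 1.0, 3.0]]
t = 0.3
x, y, z = solve_linear_system(A, [1.0, 0.0, 0.0], t)
```

With this initial value the solution is the first column of exp(tA); for example x(t) = ½ e^{2t}(1 + e^{2t} − 2t), which the numerical result matches.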
Inhomogeneous case – variation of parameters
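For the inhomogeneous case, we can use integrating factors (a method akin to variation of parameters). We seek a particular solution of the form $y_p(t) = e^{tA} z(t)$. For $y_p$ to be a solution:

$$\begin{aligned} e^{tA} z'(t) &= b(t) \\ z'(t) &= e^{-tA} b(t) \\ z(t) &= \int_0^t e^{-uA}\, b(u)\, du + c . \end{aligned}$$

Thus

$$y_p(t) = e^{tA} \int_0^t e^{-uA}\, b(u)\, du + e^{tA} c,$$

where c is determined by the initial conditions of the problem.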
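More precisely, consider the equation

$$Y' - A\, Y = F(t)$$

with the initial condition $Y(t_0) = Y_0$, where
A is an n by n complex matrix,
F is a continuous function from some open interval I to $\mathbb{C}^{n}$,
$t_0$ is a point of I, and
$Y_0$ is a vector of $\mathbb{C}^{n}$.
Left multiplying the above displayed equality by $e^{-tA}$ and integrating from $t_0$ to t, we get

$$Y(t) = e^{(t - t_0)A}\, Y_0 + \int_{t_0}^{t} e^{(t - x)A}\, F(x)\, dx .$$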
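The solution formula Y(t) = exp((t − t₀)A) Y₀ + ∫ exp((t − x)A) F(x) dx, with the integral taken from t₀ to t, can be evaluated numerically. The sketch below is one possible illustration: the function name `duhamel`, the use of Simpson's rule with 200 subintervals, and the test problem (the oscillator y'' + y = 1 written as a first-order system) are all assumptions of this example.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm_series(X, terms=30):
    """Partial sum of the exponential series; fine for small matrices of modest norm."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, X)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def exp_times_vector(A, s, v):
    """Return exp(sA) v."""
    E = expm_series([[s * a for a in row] for row in A])
    return [sum(e * x for e, x in zip(row, v)) for row in E]

def duhamel(A, F, Y0, t0, t, steps=200):
    """Evaluate the solution formula, using Simpson's rule (steps must be even)."""
    h = (t - t0) / steps
    total = [0.0] * len(Y0)
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        x = t0 + i * h
        value = exp_times_vector(A, t - x, F(x))        # integrand exp((t-x)A) F(x)
        total = [acc + weight * v for acc, v in zip(total, value)]
    integral = [h / 3 * v for v in total]
    homogeneous = exp_times_vector(A, t - t0, Y0)       # exp((t-t0)A) Y0
    return [a + b for a, b in zip(homogeneous, integral)]

A = [[0.0, 1.0], [-1.0, 0.0]]                           # Y = (y, y') for y'' + y = f
Y = duhamel(A, lambda x: [0.0, 1.0], [0.0, 0.0], 0.0, 1.0)
```

For this test problem the exact solution is y(t) = 1 − cos t with y'(t) = sin t, so `Y` should be close to (1 − cos 1, sin 1).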
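We claim that the solution to the equation

$$P(d/dt)\, y = f(t)$$

with the initial conditions $y^{(k)}(t_0) = y_k$ for 0 ≤ k < n is

$$y(t) = \sum_{k=0}^{n-1} y_k\, s_k(t - t_0) + \int_{t_0}^{t} s_{n-1}(t - x)\, f(x)\, dx,$$

where the notation is as follows:
$P \in \mathbb{C}[X]$ is a monic polynomial of degree n > 0,
f is a continuous complex valued function defined on some open interval I,
$t_0$ is a point of I,
$y_k$ is a complex number, and
$s_k(t)$ is the coefficient of $X^{k}$ in the polynomial denoted by $S_t \in \mathbb{C}[X]$ in the subsection Evaluation by Laurent series above.
To justify this claim, we transform our order n scalar equation into an order one vector equation by the usual reduction to a first order system. Our vector equation takes the form

$$\frac{dY}{dt} - A\, Y = F(t), \qquad Y(t_0) = Y_0,$$

where $Y = (y, y', \ldots, y^{(n-1)})$, $F = (0, \ldots, 0, f)$, and A is the companion matrix of P with ones on the superdiagonal and the negated coefficients of P in the last row.
In the case n = 2 we get the following statement. The solution to

$$y'' - (\alpha + \beta)\, y' + \alpha\beta\, y = f(t), \qquad y(t_0) = y_0, \quad y'(t_0) = y_1$$

is

$$y(t) = y_0\, s_0(t - t_0) + y_1\, s_1(t - t_0) + \int_{t_0}^{t} s_1(t - x)\, f(x)\, dx,$$

where the functions $s_0$ and $s_1$ are as in the subsection Evaluation by Laurent series above.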
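Say we have the system

$$\begin{aligned} x' &= 2x - y + z + e^{2t} \\ y' &= 3y - z \\ z' &= 2x + y + 3z + e^{2t} . \end{aligned}$$

So we then have

$$A = \begin{bmatrix} 2 & -1 & 1 \\ 0 & 3 & -1 \\ 2 & 1 & 3 \end{bmatrix}, \qquad b(t) = e^{2t} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} .$$

From before, we have the general solution to the homogeneous equation. Since the sum of the homogeneous and particular solutions gives the general solution to the inhomogeneous problem, we now only need to find the particular solution (via variation of parameters).
We have, above:

$$y_p(t) = e^{tA} \int_0^t e^{-uA} \begin{bmatrix} e^{2u} \\ 0 \\ e^{2u} \end{bmatrix} du + e^{tA} c,$$

which can be further simplified to get the requisite particular solution determined through variation of parameters.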
- Bhatia, R. (1997). Matrix Analysis. Graduate Texts in Mathematics 169. Springer. ISBN 978-0-387-94846-1.
- E. H. Lieb (1973). "Convex trace functions and the Wigner–Yanase–Dyson conjecture". Adv. Math. 11 (3): 267–288. doi:10.1016/0001-8708(73)90011-X.
- H. Epstein (1973). "Remarks on two theorems of E. Lieb". Commun. Math. Phys. 31 (4): 317–325. doi:10.1007/BF01646492.
- R. M. Wilcox (1967). "Exponential Operators and Parameter Differentiation in Quantum Physics". Journal of Mathematical Physics 8 (4): 962–982. doi:10.1063/1.1705306.
- "Matrix exponential - MATLAB expm - MathWorks Deutschland". Mathworks.de. 2011-04-30. Retrieved 2013-06-05.
- "GNU Octave - Functions of a Matrix". Network-theory.co.uk. 2007-01-11. Retrieved 2013-06-05.
- This can be generalized; in general, the exponential of $J_n(a)$ is an upper triangular matrix with $e^{a}/0!$ on the main diagonal, $e^{a}/1!$ on the one above, $e^{a}/2!$ on the next one, and so on.
- Horn, Roger A.; Johnson, Charles R. (1991). Topics in Matrix Analysis. Cambridge University Press. ISBN 978-0-521-46713-1.
- Moler, Cleve; Van Loan, Charles F. (2003). "Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later". SIAM Review 45 (1): 3–49. doi:10.1137/S00361445024180. ISSN 1095-7200.