Perron–Frobenius theorem

In linear algebra, the Perron–Frobenius theorem, named after Oskar Perron and Georg Frobenius, asserts that a real square matrix with positive entries has a unique largest real eigenvalue and that the corresponding eigenvector has strictly positive components. Similar statements apply to certain classes of nonnegative matrices. This theorem has important applications to probability theory (ergodicity of Markov chains) and to the theory of dynamical systems (subshifts of finite type).

Statement of the Perron–Frobenius theorem

A matrix in which all entries are positive real numbers is called positive and a matrix whose entries are non-negative real numbers is called non-negative. The eigenvalues of a real square matrix A are complex numbers and collectively they make up the spectrum of the matrix. The rate of growth of the matrix powers A^k as k → ∞ is controlled by the eigenvalue of A with the largest absolute value. The Perron–Frobenius theorem describes the properties of the leading eigenvalue and of the corresponding eigenvectors when A is a non-negative real square matrix. Early results were due to Oskar Perron (1907) and concerned positive matrices. Later, Georg Frobenius (1912) found their extension to certain classes of non-negative matrices.

Positive matrices

Let A = (aij) be an n × n positive matrix: aij > 0 for 1 ≤ i, j ≤ n. Then the following statements hold.

  1. There is a positive real number r, called the Perron root or the Perron–Frobenius eigenvalue, such that r is an eigenvalue of A and any other eigenvalue λ (possibly, complex) is strictly smaller than r in absolute value, |λ| < r. Thus, the spectral radius ρ(A) is equal to r.
  2. The Perron–Frobenius eigenvalue is simple: r is a simple root of the characteristic polynomial of A. Consequently, both the right eigenspace and the left eigenspace associated to r are one-dimensional.
  3. There exists a left (i.e. row) eigenvector v = (v1,…,vn) of A with eigenvalue r such that all components of v are positive: vA = rv, vi > 0 for 1 ≤ i ≤ n.
  4. Likewise, there exists a right (i.e. column) eigenvector w = (w1,…,wn)^T of A with eigenvalue r such that all components of w are positive: Aw = rw, wi > 0 for 1 ≤ i ≤ n.
  5. The Perron–Frobenius eigenvalue satisfies the inequalities

     \min_i \sum_j a_{ij} \;\le\; r \;\le\; \max_i \sum_j a_{ij}.

The left and right eigenvectors v and w are usually normalized so that the sum of their components is equal to 1; in this case, they are sometimes called stochastic eigenvectors.
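
The statements above lend themselves to a direct numerical check. The following is a minimal sketch (not part of the original article) in Python with NumPy; the matrix A is an arbitrary positive example.

```python
# Numerical check of statements (1)-(5) for an arbitrary positive matrix.
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])             # all entries strictly positive

eigvals, eigvecs = np.linalg.eig(A)
r = eigvals.real.max()                      # the Perron root
assert np.isclose(np.abs(eigvals).max(), r) # r equals the spectral radius

w = eigvecs[:, np.argmax(eigvals.real)].real
w = w / w.sum()                             # stochastic normalization fixes the sign
assert np.all(w > 0)                        # strictly positive right eigenvector

row_sums = A.sum(axis=1)                    # statement (5): row-sum bounds on r
assert row_sums.min() <= r <= row_sums.max()
```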

Non-negative matrices

A real square matrix A is primitive if it is non-negative and its kth power A^k is positive for some natural number k. A square matrix A is reducible if it can be conjugated into block upper triangular form by a permutation matrix P:

    PAP^{-1} = \begin{pmatrix} E & F \\ 0 & G \end{pmatrix},

where E and G are non-trivial square matrices. A square matrix is irreducible if it is not reducible.

A positive square matrix is primitive and a primitive matrix is irreducible. All statements of the Perron–Frobenius theorem for positive matrices remain true for primitive matrices. However, a non-negative irreducible matrix A may possess several eigenvalues whose absolute value is equal to the spectral radius of A, so the statements need to be correspondingly modified. This was done by Frobenius in 1912.

Let A be an irreducible non-negative n × n matrix with spectral radius ρ(A) = r. Then the following statements hold.

  1. The number r is a positive real number and it is an eigenvalue of the matrix A, called the Perron–Frobenius eigenvalue.
  2. The Perron–Frobenius eigenvalue r is simple. Both right and left eigenspaces associated with r are one-dimensional.
  3. A has a left eigenvector v with eigenvalue r whose components are all positive.
  4. Likewise, A has a right eigenvector w with eigenvalue r whose components are all positive.
  5. The only eigenvectors whose components are all positive are those associated with the eigenvalue r.
  6. Let h be the number of roots of the characteristic polynomial of A with absolute value r. Then each of them has multiplicity one and is the product of r with an hth root of unity.
  7. Let ω = 2π/h. Then the spectrum of A is invariant under multiplication by e^{iω} (corresponding to the rotation of the complex plane by the angle ω). Consequently, the matrix A is similar to e^{iω}A.
  8. If h > 1 then there exists a permutation matrix P such that

     PAP^{-1} = \begin{pmatrix} 0 & A_1 & 0 & \cdots & 0 \\ 0 & 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & A_{h-1} \\ A_h & 0 & 0 & \cdots & 0 \end{pmatrix},

     where the blocks along the main diagonal are zero square matrices.

If A is a non-negative primitive matrix then the number h is equal to 1. Moreover, this property characterizes non-negative primitive matrices among all non-negative irreducible matrices.[citation needed]
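
These notions can be tested computationally. Below is a sketch (not from the article) in Python/NumPy, relying on two standard facts: an n × n non-negative matrix A is irreducible if and only if (I + A)^{n−1} is positive, and a primitive matrix satisfies A^k > 0 for some k ≤ n² − 2n + 2 (Wielandt's bound).

```python
# Sketch: testing irreducibility and primitivity of a non-negative matrix.
import numpy as np

def is_irreducible(A):
    # A is irreducible iff (I + A)^(n-1) has no zero entry.
    n = A.shape[0]
    B = np.linalg.matrix_power(np.eye(n) + (A > 0), n - 1)
    return bool(np.all(B > 0))

def is_primitive(A):
    # A is primitive iff A^k > 0 for some k <= n^2 - 2n + 2 (Wielandt).
    n = A.shape[0]
    pattern = (A > 0).astype(float)      # only the zero pattern matters
    B = pattern.copy()
    for _ in range(n * n - 2 * n + 2):
        if np.all(B > 0):
            return True
        B = (B @ pattern > 0).astype(float)
    return bool(np.all(B > 0))

# A cyclic permutation matrix is irreducible but not primitive (h = 3):
C = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
print(is_irreducible(C), is_primitive(C))   # True False
```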

Applications

Numerous books have been written on the subject of non-negative matrices, and Perron–Frobenius theory is invariably a central feature. The range of applications is therefore vast, and the examples given below barely scratch its surface.

Non-negative matrices

There is a common misconception that the Perron–Frobenius theorem applies to all non-negative matrices. The identity matrices provide trivial counter-examples: the eigenvalue 1 is far from simple and most of its eigenvectors fail to be positive. Nevertheless, the following piece of folklore (obtained by a process of iterative reduction) admits application of the theorem to any non-negative matrix ....

  • Any reducible square matrix A may be written in upper-triangular block form

    PAP^{-1} = \begin{pmatrix} B_1 & * & \cdots & * \\ 0 & B_2 & \cdots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_k \end{pmatrix},

    where P is a permutation matrix and each B_i is a square matrix that is either irreducible or zero.

Now if A is non-negative then so are all the Bi, and the spectrum of A is just the union of their spectra. Therefore many of the spectral properties of A may be deduced by applying the theorem to the irreducible Bi. For example, the Perron root is the maximum of the ρ(Bi), and while there will still be eigenvectors with non-negative components, it is quite possible that none of them is positive. Thus the Perron–Frobenius theorem can be applied to any non-negative matrix, but only after this reduction has been performed.
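
The reduction can be carried out mechanically by grouping indices into the strongly connected components of the underlying graph. A minimal sketch (not from the article), assuming NumPy and SciPy are available:

```python
# Sketch: block reduction of a reducible non-negative matrix.  The diagonal
# blocks B_i are the submatrices indexed by the strongly connected
# components, and the spectrum of A is the union of their spectra.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

A = np.array([[1.0, 2.0, 0.0, 0.0],
              [3.0, 0.0, 1.0, 0.0],     # one arc from block {0,1} to block {2,3}
              [0.0, 0.0, 0.0, 2.0],
              [0.0, 0.0, 1.0, 0.0]])

n_comp, labels = connected_components(csr_matrix(A), directed=True,
                                      connection='strong')

block_eigs = []
for c in range(n_comp):
    idx = np.where(labels == c)[0]
    block_eigs.extend(np.linalg.eigvals(A[np.ix_(idx, idx)]))

# The multiset of block eigenvalues equals the spectrum of A.
assert np.allclose(np.sort_complex(np.array(block_eigs)),
                   np.sort_complex(np.linalg.eigvals(A)))
```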

Stochastic matrices

A row (column) stochastic matrix is a square matrix each of whose rows (columns) consists of non-negative real numbers whose sum is unity. Strictly speaking, the theorem cannot be applied directly to such matrices because they need not be irreducible. If A is row-stochastic then the column vector with each entry 1 is clearly an eigenvector corresponding to the eigenvalue 1, which is also ρ(A) by the remark above. However, it may not be the only eigenvalue on the unit circle, and the associated eigenspace can be multi-dimensional. If A is row-stochastic and irreducible then the Perron projection is also row-stochastic and all its rows are equal. Stochastic matrices are extremely important in their own right; for example, they are widely used by Internet search engines for ranking web pages.
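
As an illustration of the ranking application, the sketch below (not from the article) uses power iteration to find the stationary distribution of a small row-stochastic matrix; the matrix itself is an arbitrary example.

```python
# Sketch: stationary distribution of a positive row-stochastic matrix,
# computed by power iteration; pi is the stochastic left Perron eigenvector.
import numpy as np

S = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])          # each row sums to 1

pi = np.full(3, 1.0 / 3.0)               # any starting distribution works here
for _ in range(200):
    pi = pi @ S                          # one step of the chain

assert np.isclose(pi.sum(), 1.0)         # still a probability vector
assert np.allclose(pi @ S, pi)           # pi S = pi: the eigenvalue is 1
print(pi)
```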

Algebraic graph theory

The theorem has particular use in algebraic graph theory. The "underlying graph" of a non-negative n × n matrix A is the graph with vertices numbered 1, …, n and an arc from i to j if and only if Aij ≠ 0. If the underlying graph of such a matrix is strongly connected, then the matrix is irreducible, and thus the theorem applies. In particular, the adjacency matrix of a connected graph is irreducible.[1][2]

Finite Markov chains

The theorem has a natural interpretation in the theory of finite Markov chains (where it is the matrix-theoretic equivalent of the convergence of an irreducible aperiodic finite Markov chain to its stationary distribution, formulated in terms of the transition matrix of the chain; see, for example, the article on the subshift of finite type). More generally, it is frequently applied in the theory of transfer operators, where it is commonly known as the Ruelle–Perron–Frobenius theorem (named after David Ruelle). In this case, the leading eigenvalue corresponds to the thermodynamic equilibrium of a dynamical system, and the lesser eigenvalues to the decay modes of a system that is not in equilibrium.

Towards a proof

There are many proofs of the theorem, but none of them is easy. A common thread in many of them is the Brouwer fixed point theorem, itself a deep result. Another popular method is that of Wielandt (1950), who used the following result (known as the Collatz–Wielandt formula) to extend and clarify Frobenius's work.

  • Suppose A is a non-negative square matrix and, for every non-negative non-zero vector x, let f(x) be the minimum value of [Ax]i / xi taken over all those i such that xi ≠ 0. Then f is a real-valued function whose maximum is the Perron–Frobenius eigenvalue r, attained at the Perron eigenvector.
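
A quick numerical illustration (not part of the article; NumPy assumed, and the matrix is an arbitrary irreducible example):

```python
# Sketch: the Collatz-Wielandt function f evaluated at the Perron vector
# gives the Perron root; at other non-negative vectors it gives less.
import numpy as np

A = np.array([[0.0, 2.0],
              [1.0, 1.0]])               # irreducible; eigenvalues 2 and -1

def f(x):
    i = x != 0
    return ((A @ x)[i] / x[i]).min()

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
r = eigvals.real[k]                      # Perron root: r = 2
w = np.abs(eigvecs[:, k].real)           # Perron vector, proportional to (1, 1)

assert np.isclose(f(w), r)               # the maximum of f is attained at w
assert f(np.array([1.0, 2.0])) < r       # strictly smaller elsewhere
```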

The approach to the proof sketched below uses spectral theory.

Perron root

Start with a primitive matrix A normalized so that ρ(A) = 1. Let λ be any eigenvalue on the unit circle with λ ≠ 1. Then there exists a positive integer m such that A^m is a positive matrix and the real part of λ^m is negative. Let ε be half the smallest diagonal entry of A^m and set T = A^m − εI, which is yet another positive matrix. Moreover, if Ax = λx then A^m x = λ^m x, so λ^m − ε is an eigenvalue of T. Because of the choice of m, this point lies outside the unit disk, and consequently ρ(T) > 1. On the other hand, all the entries in T are positive and less than or equal to those in A^m, so by Gelfand's formula ρ(T) ≤ ρ(A^m) ≤ ρ(A)^m = 1. This contradiction shows that there can be no eigenvalue on the unit circle other than 1; and since the spectral radius is always attained by some eigenvalue, 1 must itself be an eigenvalue of A. Since for any A the spectral radius of A/ρ(A) is always 1, this result shows that ....

  • If A is primitive then its Perron root ρ(A) is the sole eigenvalue with modulus ρ(A).

Perron projection

The proof now proceeds using spectral decomposition. However, unlike the classical Jordan form, which partitions the spectrum into individual eigenvalues, the trick here is to split the Perron root away from all the other eigenvalues. The spectral projection associated with the Perron root is called the Perron projection, and it enjoys the following remarkable property ....

  • The Perron projection of an irreducible non-negative square matrix is a positive matrix.

Perron's findings and also statements (1)–(5) of the theorem are corollaries of this deceptively simple result. The key point is that a positive projection always has rank one. This means that if A is an irreducible non-negative square matrix then the algebraic and geometric multiplicities of its Perron root are both one. Also, if P is its Perron projection then AP = PA = ρ(A)P, so every column of P is a positive right eigenvector of A and every row is a positive left eigenvector. Moreover, if Ax = λx then PAx = λPx = ρ(A)Px, which means Px = 0 if λ ≠ ρ(A). But if x is a positive eigenvector then Px > 0, since P is positive; thus the only positive eigenvectors are those associated with ρ(A). And there is yet more to come. If A is a primitive matrix with ρ(A) = 1 then it can be decomposed as P ⊕ (1 − P)A, so that A^n = P + ((1 − P)A)^n. As n increases the second of these terms decays to zero, leaving P as the limit of A^n as n → ∞.

The power method is a convenient way to compute the Perron projection of a primitive matrix. If v and w are the positive row and column vectors that it generates then the Perron projection is just wv/(vw). Note that the spectral projections are not neatly blocked as in the Jordan form: here they are overlaid on top of one another, and each generally has complex entries extending to all four corners of the square matrix. Nevertheless they retain their mutual orthogonality, which is what facilitates the decomposition.
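
A minimal sketch of this computation (not from the article; NumPy assumed, with an arbitrary example matrix):

```python
# Sketch: Perron projection of a primitive matrix via the power method.
# v and w are the positive left and right Perron vectors; the projection
# is the rank-one matrix w v / (v w).
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])               # positive, hence primitive; rho(A) = 4

w = np.ones(2)                           # right vector: iterates of A w
v = np.ones(2)                           # left vector:  iterates of v A
for _ in range(100):
    w = A @ w; w /= w.sum()
    v = v @ A; v /= v.sum()

r = (A @ w)[0] / w[0]                    # estimate of the Perron root rho(A)
P = np.outer(w, v) / (v @ w)             # the Perron projection w v / (v w)

assert np.allclose(P @ P, P)             # idempotent, rank one
assert np.allclose(A @ P, r * P)         # A P = P A = rho(A) P
assert np.allclose(np.linalg.matrix_power(A / r, 50), P)  # (A / rho)^n -> P
```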

Peripheral projection

The analysis when A is irreducible and non-negative is broadly similar. The Perron projection is still positive, but there may now be other eigenvalues of modulus ρ(A) which negate the use of the power method and prevent the powers of (1 − P)A from decaying as in the primitive case (again taking ρ(A) = 1). So enter the peripheral projection: the spectral projection of A corresponding to all the eigenvalues that have modulus ρ(A) ....

  • The peripheral projection of an irreducible non-negative square matrix is a non-negative matrix with a positive diagonal.

Cyclicity

Suppose in addition that ρ(A) = 1 and that A has h eigenvalues on the unit circle. If P is the peripheral projection then the matrix R = AP = PA is non-negative and irreducible, R^h = P, and the cyclic group P, R, R^2, …, R^{h−1} represents the harmonics of A. The spectral projection of A at an eigenvalue λ on the unit circle is given by the formula

    \frac{1}{h} \sum_{k=1}^{h} \lambda^{-k} R^k.

All of these projections (including the Perron projection) have the same positive diagonal; moreover, choosing any one of them and then taking the modulus of every entry invariably yields the Perron projection. Some routine work is still needed to establish the cyclic properties (6)–(8), but it is essentially just a matter of turning the handle. The spectral decomposition of A is given by A = R ⊕ (1 − P)A, so the difference between A^n and R^n is A^n − R^n = ((1 − P)A)^n, representing the transients of A^n, which eventually decay to zero. P may be computed as the limit of A^{nh} as n → ∞.
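
The sketch below (not from the article) illustrates this for the 3-cycle permutation matrix, where ρ = 1 and h = 3, so the peripheral projection is the identity and R = A; the spectral-projection formula above can then be checked directly.

```python
# Sketch: cyclic structure of the 3-cycle permutation matrix (h = 3).
# The spectral projection at a peripheral eigenvalue lambda is
# (1/h) * sum_{k=1..h} lambda^(-k) R^k.
import numpy as np

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])          # irreducible, rho(R) = 1
h = 3
assert np.allclose(np.linalg.matrix_power(R, h), np.eye(3))  # R^h = P = identity

for m in range(h):
    lam = np.exp(2 * np.pi * 1j * m / h)                     # an h-th root of unity
    E = sum(lam ** (-k) * np.linalg.matrix_power(R, k)
            for k in range(1, h + 1)) / h
    assert np.allclose(E @ E, E)                 # a projection
    assert np.allclose(R @ E, lam * E)           # onto the lambda-eigenspace
    assert np.allclose(np.diag(E), 1.0 / 3.0)    # all share the same positive diagonal
    assert np.allclose(np.abs(E), 1.0 / 3.0)     # moduli give the Perron projection
```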

Caveats

The matrices

    L = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}, \quad
    P = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ -1 & 1 & 1 \end{pmatrix}, \quad
    T = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \quad
    M = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}

provide simple examples of what can go wrong if the necessary conditions are not met. It is easily seen that the Perron and peripheral projections of L are both equal to P; thus, when the original matrix is reducible, the projections may lose non-negativity, and there is no chance of expressing them as limits of its powers. The matrix T is an example of a primitive matrix with zero diagonal. If the diagonal of an irreducible non-negative square matrix is non-zero then the matrix must be primitive, but this example demonstrates that the converse is false. M is an example of a matrix with several missing spectral teeth. If ω = e^{iπ/3} then ω^6 = 1 and the eigenvalues of M are {1, ω^2, ω^3, ω^4}, so ω and ω^5 are both absent.
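
These caveats are easy to verify numerically. A sketch (not from the article; NumPy assumed, with the matrices defined as above):

```python
# Sketch: numerical verification of the caveats above.
import numpy as np

L = np.array([[1, 0, 0], [1, 0, 0], [1, 1, 1]], dtype=float)
P = np.array([[1, 0, 0], [1, 0, 0], [-1, 1, 1]], dtype=float)
T = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)

assert np.allclose(P @ P, P)                            # P is a projection ...
assert np.allclose(L @ P, L) and np.allclose(P @ L, L)  # ... commuting with L
assert np.all(np.linalg.matrix_power(T, 2) > 0)         # T primitive, zero diagonal

M = np.zeros((5, 5))
M[0, 1] = M[1, 0] = 1                             # a 2-cycle ...
M[2, 3] = M[3, 4] = M[4, 2] = 1                   # ... plus a 3-cycle
w = np.exp(1j * np.pi / 3)                        # omega = e^{i pi / 3}
eig = np.linalg.eigvals(M)
for z in (1, w**2, w**3, w**4):                   # these eigenvalues occur ...
    assert np.any(np.isclose(eig, z))
for z in (w, w**5):                               # ... but omega and omega^5 are absent
    assert not np.any(np.isclose(eig, z))
```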

Terminology

A significant problem that inevitably causes confusion is a lack of standardisation in the definitions. For example some authors use the terms strictly positive and positive to mean > 0 and ≥ 0 respectively. In the interests of consistency this article takes its definitions from those elsewhere in Wikipedia. Thus positive here means > 0 and non-negative means ≥ 0.

The non-negative eigenvector is often normalized so that the sum of its components is equal to unity; in this case, the eigenvector is the vector of a probability distribution and is sometimes called a stochastic eigenvector.

Perron–Frobenius eigenvalue and dominant eigenvalue are alternative names for the Perron root.

Spectral projections are also known as spectral projectors and spectral idempotents.

Another vexed area concerns decomposability and reducibility: it is not unknown for the same author to swap their meanings between different books. Irreducible is a grossly overworked term. For the avoidance of doubt, a non-negative square matrix A such that 1 + A is primitive is sometimes said to be connected. Then irreducible non-negative square matrices and connected matrices are synonymous.[3]

Generalizations

For finite matrices, the Perron–Frobenius theory extends naturally to quasipositive matrices and essentially non-negative matrices, which occur in the study of continuous-time stochastic processes. Related results occur in the theory of Z-matrices and M-matrices.

For countably infinite matrices and more general positive operators on ordered vector spaces, there are several extensions – e.g., Jentzsch's theorem, the Krein–Rutman theorem, and Lomonosov's theorem on the invariant subspace of a compact-friendly operator – which are discussed in the books by Yuri Abramovich and Charalambos D. Aliprantis, by Garrett Birkhoff, by Mark Krasnosel'skii, and by Lyubich. These results are useful in the theory of Markov chains and Markov processes (Karlin, Seneta, Nummelin, Meyn and Tweedie).

The Gershgorin circle theorem also constrains the location of the eigenvalues of a general matrix: every eigenvalue lies in at least one of the discs centred at a diagonal entry aii with radius equal to the sum of the moduli of the other entries in row i.
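
A quick numerical illustration of that bound (not from the article; NumPy assumed, arbitrary example):

```python
# Sketch: every eigenvalue lies in some Gershgorin disc
# |z - a_ii| <= sum_{j != i} |a_ij|.
import numpy as np

A = np.array([[4.0, 1.0, 0.5],
              [0.2, 3.0, 0.3],
              [0.1, 0.4, 2.0]])

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)

for z in np.linalg.eigvals(A):
    assert np.any(np.abs(z - centers) <= radii)   # z is inside at least one disc
```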

References

  1. ^ Richard A. Brualdi and Herbert John Ryser. Combinatorial Matrix Theory. Cambridge University Press.
  2. ^ Richard A. Brualdi and Dragos Cvetkovic. A Combinatorial Approach to Matrix Theory and Its Applications, CRC Press, Boca Raton FL, 2009.
  3. ^ For surveys of results on irreducibility, see Olga Taussky-Todd and Richard A. Brualdi.
  • Yuri A. Abramovich and Charalambos D. Aliprantis (2002). An Invitation to Operator Theory. American Mathematical Society. ISBN 978-0-8218-2146-6.
  • A. Berman and R. J. Plemmons, Nonnegative Matrices in the Mathematical Sciences, Academic Press, 1979 (chapter 2), ISBN 0-12-092250-9
  • Abraham Berman, Robert J. Plemmons, Nonnegative Matrices in the Mathematical Sciences, 1994, SIAM. ISBN 0-89871-321-8.
  • Garrett Birkhoff (1967). Lattice Theory, 3rd ed. American Mathematical Society. (First edition 1940.)
  • Garrett Birkhoff, Extensions of Jentzsch's theorem, Trans. Amer. Math. Soc. 85 (1957), 219–227 (There is a gap in the proof, which was diagnosed and corrected by Alexander M. Ostrowski, as noted in the third edition of Birkhoff's Lattice Theory.)
  • Frobenius, G. (1912), "Ueber Matrizen aus nicht negativen Elementen", Sitzungsber. Königl. Preuss. Akad. Wiss.: 456–477
  • C. Godsil and G. Royle, Algebraic Graph Theory, Springer, 2001 (chapter 8).
  • A. Graham, Nonnegative Matrices and Applicable Topics in Linear Algebra, John Wiley & Sons, New York, 1987.
  • R. A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, 1990 (chapter 8).
  • Samuel Karlin and H. M. Taylor. "A First Course in Stochastic Processes." Academic Press, 1975 (second edition).
  • Krasnosel'skii, M. A. (1964). Positive Solutions of Operator Equations. Groningen: P. Noordhoff Ltd. 381 pp.
  • Krasnosel'skii, M. A.; Lifshits, Je. A.; Sobolev, A. V. (1990). Positive Linear Systems: The method of positive operators. Sigma Series in Applied Mathematics, Vol. 5. Berlin: Helderman Verlag. 354 pp.
  • Yurii I. Lyubich. Introduction to the Theory of Banach Representations of Groups. Translated from the 1985 Russian-language edition (Kharkov, Ukraine). Birkhäuser Verlag. 1988.
  • S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. London: Springer-Verlag, 1993. ISBN 0-387-19832-6. online: https://netfiles.uiuc.edu/meyn/www/spm_files/book.html . Second edition to appear, Cambridge University Press, 2009.
  • Henryk Minc, Nonnegative Matrices, John Wiley & Sons, New York, 1988, ISBN 0-471-83966-3
  • E. Nummelin. "General irreducible Markov chains and non-negative operators". Cambridge University Press, 1984, 2004. ISBN 0-521-60494-X
  • Perron, Oskar (1907), "Zur Theorie der Matrices", Mathematische Annalen, 64 (2): 248–263, doi:10.1007/BF01449896
  • Seneta, E. Non-negative matrices and Markov chains. 2nd rev. ed., 1981, XVI, 288 p., Softcover Springer Series in Statistics. (Originally published by Allen & Unwin Ltd., London, 1973) ISBN 978-0-387-29765-1
  • Suprunenko, D.A. (2001) [1994], "Perron–Frobenius theorem", Encyclopedia of Mathematics, EMS Press
  • Richard S. Varga 2002 Matrix Iterative Analysis, Second ed. (of 1962 Prentice Hall edition), Springer-Verlag.