Arnoldi iteration

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

In numerical linear algebra, the Arnoldi iteration is an eigenvalue algorithm and an important example of an iterative method. Arnoldi finds an approximation to the eigenvalues and eigenvectors of general (possibly non-Hermitian) matrices by constructing an orthonormal basis of the Krylov subspace, which makes it particularly useful when dealing with large sparse matrices.

The Arnoldi method belongs to a class of linear algebra algorithms that give a partial result after a small number of iterations, in contrast to so-called direct methods which must complete to give any useful results (see for example, Householder transformation). The partial result in this case being the first few vectors of the basis the algorithm is building.

When applied to Hermitian matrices it reduces to the Lanczos algorithm. The Arnoldi iteration was invented by W. E. Arnoldi in 1951.[1]

Krylov subspaces and the power iteration[edit]

An intuitive method for finding the largest (in absolute value) eigenvalue of a given m × m matrix is the power iteration: starting with an arbitrary initial vector b, calculate Ab, A2b, A3b,… normalizing the result after every application of the matrix A.

This sequence converges to the eigenvector corresponding to the eigenvalue with the largest absolute value, . However, much potentially useful computation is wasted by using only the final result, . This suggests that instead, we form the so-called Krylov matrix:

The columns of this matrix are not in general orthogonal, but we can extract an orthogonal basis, via a method such as Gram–Schmidt orthogonalization. The resulting set of vectors is thus an orthogonal basis of the Krylov subspace, . We may expect the vectors of this basis to give good approximations of the eigenvectors corresponding to the largest eigenvalues, for the same reason that approximates the dominant eigenvector.

The Arnoldi iteration[edit]

The Arnoldi iteration uses the stabilized Gram–Schmidt process to produce a sequence of orthonormal vectors, q1, q2, q3, …, called the Arnoldi vectors, such that for every n, the vectors q1, …, qn span the Krylov subspace . Explicitly, the algorithm is as follows:

  • Start with an arbitrary vector q1 with norm 1.
  • Repeat for k = 2, 3, …
    • for j from 1 to k − 1

The j-loop projects out the component of in the directions of . This ensures the orthogonality of all the generated vectors.

The algorithm breaks down when qk is the zero vector. This happens when the minimal polynomial of A is of degree k. In most applications of the Arnoldi iteration, including the eigenvalue algorithm below and GMRES, the algorithm has converged at this point.

Every step of the k-loop takes one matrix-vector product and approximately 4mk floating point operations.

In the programming language python:

 1 import numpy as np
 2 def arnoldi_iteration(A, b, n):
 3     """
 4     Computes a basis of the (n+1)-Krylov subspace of A: the space
 5     spanned by {b, Ab, ..., A^n b}.
 6 
 7     Input
 8     A: mxm array
 9     b: initial vector (length m)
10     n: dimension of Krylov subspace, must be >=1
11     
12     Returns Q, h
13     Q: mx(n+1) array, the columns are an orthonormal basis of the
14     Krylov subspace.
15     h: (n+1)xn array, A on basis Q. It is upper Hessenberg.  
16     """
17     m = A.shape[0] 
18 
19     h = np.zeros((n+1, n))    
20     Q = np.zeros((m, n+1))      
21 
22     q = b/np.linalg.norm(b)     # Normalize the input vector
23     Q[:, 0] = q                 # Use it as the first Krylov vector
24     
25     for k in range(n):           
26         v = A.dot(q)            # Generate a new candidate vector 
27         for j in range(k+1):    # Subtract the projections on previous vectors
28             h[j, k] = np.dot(Q[:, j].conj(),v)
29             v = v - h[j, k]*Q[:, j]   
30 
31         h[k+1, k] = np.linalg.norm(v)
32         eps = 1e-12             # If v is shorter than this threshold it is the zero vector
33         if h[k+1,k] > eps:      # Add the produced vector to the list, unless
34             q = v / h[k+1, k]   #   the zero vector is produced.
35             Q[:, k+1] = q
36         else:                   # If that happens, stop iterating.
37             return Q,h
38     return Q,h

Properties of the Arnoldi iteration[edit]

Let Qn denote the m-by-n matrix formed by the first n Arnoldi vectors q1, q2, …, qn, and let Hn be the (upper Hessenberg) matrix formed by the numbers hj,k computed by the algorithm:

We then have

This yields an alternative interpretation of the Arnoldi iteration as a (partial) orthogonal reduction of A to Hessenberg form. The matrix Hn can be viewed as the representation in the basis formed by the Arnoldi vectors of the orthogonal projection of A onto the Krylov subspace .

The matrix Hn can be characterized by the following optimality condition. The characteristic polynomial of Hn minimizes ||p(A)q1||2 among all monic polynomials of degree n. This optimality problem has a unique solution if and only if the Arnoldi iteration does not break down.

The relation between the Q matrices in subsequent iterations is given by

where

is an (n+1)-by-n matrix formed by adding an extra row to Hn.

Finding eigenvalues with the Arnoldi iteration[edit]

The idea of the Arnoldi iteration as an eigenvalue algorithm is to compute the eigenvalues of the orthogonal projection of A onto the Krylov subspace. This projection is represented by Hn. The eigenvalues of Hn are called the Ritz eigenvalues. Since Hn is a Hessenberg matrix of modest size, its eigenvalues can be computed efficiently, for instance with the QR algorithm. This is an example of the Rayleigh-Ritz method.

It is often observed in practice that some of the Ritz eigenvalues converge to eigenvalues of A. Since Hn is n-by-n, it has at most n eigenvalues, and not all eigenvalues of A can be approximated. Typically, the Ritz eigenvalues converge to the extreme eigenvalues of A. This can be related to the characterization of Hn as the matrix whose characteristic polynomial minimizes ||p(A)q1|| in the following way. A good way to get p(A) small is to choose the polynomial p such that p(x) is small whenever x is an eigenvalue of A. Hence, the zeros of p (and thus the Ritz eigenvalues) will be close to the eigenvalues of A.

However, the details are not fully understood yet. This is in contrast to the case where A is symmetric. In that situation, the Arnoldi iteration becomes the Lanczos iteration, for which the theory is more complete.

Implicitly restarted Arnoldi method (IRAM)[edit]

Due to practical storage consideration, common implementations of Arnoldi methods typically restart after some number of iterations. One major innovation in restarting was due to Lehoucq and Sorensen who proposed the Implicitly Restarted Arnoldi Method.[2] They also implemented the algorithm in a freely available software package called ARPACK.[3] This has spurred a number of other variations including Implicitly Restarted Lanczos method.[4][5][6] It also influenced how other restarted methods are analyzed.[7] Theoretical results have shown that convergence improves with an increase in the Krylov subspace dimension n. However, an a-priori value of n which would lead to optimal convergence is not known. Recently a dynamic switching strategy[8] has been proposed which fluctuates the dimension n before each restart and thus leads to acceleration in the rate of convergence.

See also[edit]

The generalized minimal residual method (GMRES) is a method for solving Ax = b based on Arnoldi iteration.

References[edit]

  1. ^ W. E. Arnoldi, "The principle of minimized iterations in the solution of the matrix eigenvalue problem," Quarterly of Applied Mathematics, volume 9, pages 17–29, 1951
  2. ^ R. B. Lehoucq & D. C. Sorensen (1996). "Deflation Techniques for an Implicitly Restarted Arnoldi Iteration". SIAM. doi:10.1137/S0895479895281484. Missing or empty |url= (help)
  3. ^ R. B. Lehoucq; D. C. Sorensen & C. Yang (1998). "ARPACK Users Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods". SIAM.
  4. ^ D. CALVETTI; L. REICHEL & D.C. SORENSEN (1994). "An Implicitly Restarted Lanczos Method for Large Symmetric Eigenvalue Problems". ETNA.
  5. ^ E. Kokiopoulou; C. Bekas & E. Gallopoulos (2003). "An Implicitly Restarted Lanczos Bidiagonalization Method for Computing Smallest Singular Triplets" (PDF). SIAM.
  6. ^ Zhongxiao Jia (2002). "The refined harmonic Arnoldi method and an implicitly restarted refined algorithm for computing interior eigenpairs of large matrices". Appl. Numer. Math. doi:10.1016/S0168-9274(01)00132-5. Missing or empty |url= (help)
  7. ^ Andreas Stathopoulos and Yousef Saad and Kesheng Wu (1998). "Dynamic Thick Restarting of the Davidson, and the Implicitly Restarted Arnoldi Methods". SIAM. doi:10.1137/S1064827596304162. Missing or empty |url= (help)
  8. ^ K.Dookhitram, R. Boojhawon & M. Bhuruth (2009). "A New Method For Accelerating Arnoldi Algorithms For Large Scale Eigenproblems". Math. Comput. Simulat. doi:10.1016/j.matcom.2009.07.009. Missing or empty |url= (help)
  • W. E. Arnoldi, "The principle of minimized iterations in the solution of the matrix eigenvalue problem," Quarterly of Applied Mathematics, volume 9, pages 17–29, 1951.
  • Yousef Saad, Numerical Methods for Large Eigenvalue Problems, Manchester University Press, 1992. ISBN 0-7190-3386-1.
  • Lloyd N. Trefethen and David Bau, III, Numerical Linear Algebra, Society for Industrial and Applied Mathematics, 1997. ISBN 0-89871-361-7.
  • Jaschke, Leonhard: Preconditioned Arnoldi Methods for Systems of Nonlinear Equations. (2004). ISBN 2-84976-001-3
  • Implementation: Matlab comes with ARPACK built-in. Both stored and implicit matrices can be analyzed through the eigs() function.