LOBPCG: Difference between revisions

Revision as of 19:47, 13 July 2018

Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is a matrix-free method for finding the largest (or smallest) eigenvalues and the corresponding eigenvectors of a symmetric positive definite generalized eigenvalue problem

Ax=\lambda Bx,

for a given pair $(A,B)$ of complex Hermitian or real symmetric matrices, where the matrix $B$ is also assumed positive-definite.

Algorithm

The method performs an iterative maximization (or minimization) of the generalized Rayleigh quotient

\rho (x):=\rho (A,B;x):={\frac {x^{T}Ax}{x^{T}Bx}},

which results in finding the largest (or smallest) eigenpairs of $Ax=\lambda Bx.$
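As a small numerical illustration of the Rayleigh quotient, the following sketch evaluates $\rho (A,B;x)$ with NumPy. The function name `rayleigh_quotient` and the 3-by-3 diagonal matrices are assumptions made for this example only; real symmetric matrices are assumed.

```python
import numpy as np

def rayleigh_quotient(A, B, x):
    """Generalized Rayleigh quotient rho(A, B; x) = (x^T A x) / (x^T B x)."""
    return (x @ A @ x) / (x @ B @ x)

# Illustrative pencil (values chosen arbitrarily): A symmetric, B symmetric positive definite.
A = np.diag([1.0, 2.0, 5.0])
B = np.diag([1.0, 1.0, 2.0])

# For a diagonal pencil the unit vectors e_k are eigenvectors
# with eigenvalues A[k, k] / B[k, k], here 1, 2 and 2.5.
e0 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 0.0, 1.0])

rho_min = rayleigh_quotient(A, B, e0)            # quotient at the smallest eigenvector
rho_max = rayleigh_quotient(A, B, e2)            # quotient at the largest eigenvector
rho_mid = rayleigh_quotient(A, B, np.ones(3))    # quotient at a generic vector
```

At an eigenvector the quotient equals the corresponding eigenvalue, and for any other nonzero vector it lies between the smallest and the largest eigenvalue, which is why maximizing or minimizing it locates the extreme eigenpairs.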

The direction of the steepest ascent, which is the gradient, of the generalized Rayleigh quotient is positively proportional to the vector

r:=Ax-\rho (x)Bx,

called the eigenvector residual. If a preconditioner $T$ is available, it is applied to the residual giving vector

w:=Tr,

called the preconditioned residual. Without preconditioning, we set $T:=I$ and so $w:=r$. An iterative method

x^{i+1}:=x^{i}+\alpha ^{i}T(Ax^{i}-\rho (x^{i})Bx^{i}),

or, in short,

x^{i+1}:=x^{i}+\alpha ^{i}w^{i},\,

w^{i}:=Tr^{i},\,

r^{i}:=Ax^{i}-\rho (x^{i})Bx^{i},

is known as preconditioned steepest ascent (or descent), where the scalar $\alpha ^{i}$ is called the step size. The optimal step size can be determined by maximizing the Rayleigh quotient, i.e.,

x^{i+1}:=\arg \max _{y\in \operatorname{span}\{x^{i},w^{i}\}}\rho (y)

(or $\arg \min$ in case of minimizing), in which case the method is called locally optimal. To further accelerate the convergence of the locally optimal preconditioned steepest ascent (or descent), one can add one extra vector to the two-term recurrence relation to make it three-term:

x^{i+1}:=\arg \max _{y\in \operatorname{span}\{x^{i},w^{i},x^{i-1}\}}\rho (y)

(use $\arg \min$ in case of minimizing). The maximization/minimization of the Rayleigh quotient in a 3-dimensional subspace can be performed numerically by the Rayleigh–Ritz method. As the iterations converge, the vectors $x^{i}$ and $x^{i-1}$ become nearly linearly dependent, making the Rayleigh–Ritz method numerically unstable in the presence of round-off errors. It is possible to substitute the vector $x^{i-1}$ with an explicitly computed difference $p^{i}=x^{i-1}-x^{i}$, which makes the Rayleigh–Ritz method more stable.^[1]
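The three-term recurrence can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the names `rayleigh_ritz` and `lobpcg_single` are hypothetical, the matrices are assumed real symmetric and dense, the preconditioner $T$ is passed as an explicit matrix, and the trial basis is orthonormalized by QR before the Rayleigh–Ritz step as an extra safeguard. The vector $p$ is formed literally as the difference of consecutive iterates.

```python
import numpy as np

def rayleigh_ritz(S, A, B, largest=False):
    """Extremal Ritz pair of the pencil (A, B) on the column span of S."""
    Q, _ = np.linalg.qr(S)                  # orthonormal basis of the trial subspace
    L = np.linalg.cholesky(Q.T @ B @ Q)     # projected B is symmetric positive definite
    Y = np.linalg.solve(L, Q.T).T           # Y = Q L^{-T}, so that Y^T B Y = I
    G = Y.T @ A @ Y                         # projected A in the B-orthonormal basis
    vals, vecs = np.linalg.eigh((G + G.T) / 2)
    k = -1 if largest else 0
    return vals[k], Y @ vecs[:, k]          # Ritz value and Ritz vector of unit B-norm

def lobpcg_single(A, B, x0, T=None, largest=False, tol=1e-6, maxiter=500):
    """Single-vector locally optimal iteration on span{x, w, p}."""
    x = x0 / np.sqrt(x0 @ B @ x0)
    p = None
    for _ in range(maxiter):
        rho = (x @ A @ x) / (x @ B @ x)
        r = A @ x - rho * (B @ x)           # eigenvector residual
        if np.linalg.norm(r) < tol:
            break
        w = r if T is None else T @ r       # preconditioned residual
        S = np.column_stack([x, w] if p is None else [x, w, p])
        _, x_new = rayleigh_ritz(S, A, B, largest)
        if x_new @ B @ x < 0:               # fix the sign so the difference stays small
            x_new = -x_new
        p = x - x_new                       # difference of consecutive iterates
        x = x_new
    return (x @ A @ x) / (x @ B @ x), x

# Illustrative test problem (an assumption of this sketch): eigenvalues 1, 2, ..., 50.
rng = np.random.default_rng(0)
A = np.diag(np.arange(1.0, 51.0))
B = np.eye(50)
lam_min, x_min = lobpcg_single(A, B, rng.standard_normal(50))
lam_max, x_max = lobpcg_single(A, B, rng.standard_normal(50), largest=True)
```

On the first step no previous iterate exists, so the sketch falls back to the two-dimensional subspace of the locally optimal steepest ascent (or descent) method.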

This is a single-vector version of the LOBPCG method. It is one of the possible generalizations of the preconditioned conjugate gradient linear solvers to the case of symmetric eigenvalue problems. Even in the trivial case $T=I$ and $B=I$, the resulting approximation with $i>3$ will be different from that obtained by the Lanczos algorithm, although both approximations will belong to the same Krylov subspace.

Iterating several approximate eigenvectors together in a block, in a similar locally optimal fashion, gives the full block version of LOBPCG. It allows robust computation of eigenvectors corresponding to nearly-multiple eigenvalues.
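A block iteration of this kind can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the method's definition: it targets only the smallest eigenvalues, uses dense real symmetric matrices, omits the locking of converged vectors, and relies on a QR factorization of the trial basis for stability. The names `ritz_pairs` and `block_lobpcg` are hypothetical.

```python
import numpy as np

def ritz_pairs(S, A, B, k):
    """k smallest Ritz pairs of the pencil (A, B) on the column span of S."""
    Q, _ = np.linalg.qr(S)                  # orthonormal basis of the trial subspace
    L = np.linalg.cholesky(Q.T @ B @ Q)
    Y = np.linalg.solve(L, Q.T).T           # B-orthonormal basis: Y^T B Y = I
    G = Y.T @ A @ Y
    vals, vecs = np.linalg.eigh((G + G.T) / 2)
    return vals[:k], Y @ vecs[:, :k]

def block_lobpcg(A, B, X0, T=None, tol=1e-6, maxiter=500):
    """Iterate a block X of k vectors using the trial subspace span[X, W, P]."""
    k = X0.shape[1]
    lam, X = ritz_pairs(X0, A, B, k)        # initial Rayleigh-Ritz on the starting block
    P = None
    for _ in range(maxiter):
        R = A @ X - (B @ X) * lam           # one residual per column
        if np.linalg.norm(R, axis=0).max() < tol:
            break
        W = R if T is None else T @ R       # preconditioned residuals
        S = np.hstack([X, W] if P is None else [X, W, P])
        lam, X_new = ritz_pairs(S, A, B, k)
        P = X - X_new                       # block analogue of the difference vector p
        X = X_new
    return lam, X

# Illustrative test problem (an assumption of this sketch): eigenvalues 1, 2, ..., 60.
rng = np.random.default_rng(0)
A = np.diag(np.arange(1.0, 61.0))
B = np.eye(60)
lam, X = block_lobpcg(A, B, rng.standard_normal((60, 3)))
```

Because all vectors of the block share one trial subspace, the Rayleigh–Ritz step separates the individual eigenvectors inside it, which is the mechanism behind the robustness for clustered eigenvalues mentioned above.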

General software implementations

LOBPCG's inventor, Andrew Knyazev, published an implementation called Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) with interfaces to PETSc and hypre.^[2] Other implementations are available in, e.g., Octave, MATLAB, Java, Anasazi (Trilinos), SLEPc, SciPy, MAGMA^[3] and NVIDIA AMGX.
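As an example of calling one of these implementations, the sketch below uses `scipy.sparse.linalg.lobpcg` from SciPy. The test matrix, the block size of three, and the diagonal preconditioner are assumptions chosen for illustration; the optional argument `B` for the generalized problem is left at its default, which corresponds to the identity.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lobpcg

n = 100
values = np.arange(1.0, n + 1.0)

# Sparse symmetric positive definite test matrix with eigenvalues 1, 2, ..., n.
A = diags([values], [0])

# Preconditioner M approximating the inverse of A (exact here, since A is diagonal).
M = diags([1.0 / values], [0])

# Random initial block; the number of columns sets how many eigenpairs are computed.
rng = np.random.default_rng(0)
X = rng.standard_normal((n, 3))

# largest=False requests the smallest eigenvalues.
eigenvalues, eigenvectors = lobpcg(A, X, M=M, largest=False, tol=1e-6, maxiter=200)
```

The matrix `A` is used only through matrix-vector products, so a linear operator or a callable can be supplied in place of an explicit matrix, in line with the matrix-free nature of the method.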

Applications

Material Sciences

LOBPCG is implemented in ABINIT (including a CUDA version) and Octopus. It has been used for multi-billion-size matrices by Gordon Bell Prize finalists on the Earth Simulator supercomputer in Japan.^[4]^[5] Recent implementations include TTPY,^[6] Platypus‐QM,^[7] and MFDn.^[8] LOBPCG has also been used on the K computer to calculate the ground state of the Hamiltonian of the Hubbard model for strongly-correlated electron systems, with the aim of understanding the mechanism behind superconductivity.^[9]

Maxwell's equations

LOBPCG is one of the core eigenvalue solvers in PYFEMax, NGSolve, and MFEM.

Denoising

An iterative LOBPCG-based approximate low-pass filter can be used for denoising,^[10] e.g., to accelerate total variation denoising.

Image Segmentation

The hypre implementation of LOBPCG with multigrid preconditioning has been applied to image segmentation^[11] via spectral graph partitioning using the graph Laplacian for the bilateral filter.

Data Mining

The software package Megaman uses LOBPCG to scale manifold learning algorithms to large data sets.^[12] NVIDIA has implemented^[13] LOBPCG in its nvGRAPH library, introduced in CUDA 8.

References

[1] Knyazev, Andrew V. (2001). "Toward the Optimal Preconditioned Eigensolver: Locally Optimal Block Preconditioned Conjugate Gradient Method". SIAM Journal on Scientific Computing. 23 (2): 517. doi:10.1137/S1064827500366124.
[2] Knyazev, A. V.; Argentati, M. E.; Lashuk, I.; Ovtchinnikov, E. E. (2007). "Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) in Hypre and PETSc". SIAM Journal on Scientific Computing. 29 (5): 2224. arXiv:0705.2626. doi:10.1137/060661624.
[3] Anzt, Hartwig; Tomov, Stanimir; Dongarra, Jack (2015). "Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product". Proceedings of the Symposium on High Performance Computing (HPC '15). Society for Computer Simulation International, San Diego, CA, USA: 75–82.
[4] Yamada, S.; Imamura, T.; Machida, M. (2005). "16.447 TFlops and 159-Billion-dimensional Exact-diagonalization for Trapped Fermion-Hubbard Model on the Earth Simulator". Proc. ACM/IEEE Conference on Supercomputing (SC'05). p. 44. doi:10.1109/SC.2005.1. ISBN 1-59593-061-2.
[5] Yamada, S.; Imamura, T.; Kano, T.; Machida, M. (2006). "Gordon Bell finalists I—High-performance computing for exact numerical approaches to quantum many-body problems on the earth simulator". Proc. ACM/IEEE conference on Supercomputing (SC '06). p. 47. doi:10.1145/1188455.1188504. ISBN 0769527000.
[6] Rakhuba, Maxim; Oseledets, Ivan (2016). "Calculating vibrational spectra of molecules using tensor train decomposition". J. Chem. Phys. 145 (12): 124101. arXiv:1605.08422. Bibcode:2016JChPh.145l4101R. doi:10.1063/1.4962420.
[7] Takano, Yu; Nakata, Kazuto; Yonezawa, Yasushige; Nakamura, Haruki (2016). "Development of massive multilevel molecular dynamics simulation program, platypus (PLATform for dYnamic protein unified simulation), for the elucidation of protein functions". J. Comput. Chem. 37 (12): 1125–1132. doi:10.1002/jcc.24318. PMC 4825406.
[8] Shao, Meiyue; et al. (2016). "Accelerating Nuclear Configuration Interaction Calculations through a Preconditioned Block Iterative Eigensolver". arXiv:1609.01689 [cs.NA].
[9] Yamada, S.; Imamura, T.; Machida, M. (2018). "High Performance LOBPCG Method for Solving Multiple Eigenvalues of Hubbard Model: Efficiency of Communication Avoiding Neumann Expansion Preconditioner". Asian Conference on Supercomputing Frontiers. Yokota R., Wu W. (eds) Supercomputing Frontiers. SCFA 2018. Lecture Notes in Computer Science, vol 10776. Springer, Cham. pp. 243–256. doi:10.1007/978-3-319-69953-0_14.
[10] Knyazev, A.; Malyshev, A. (2015). "Accelerated graph-based spectral polynomial filters". 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), Boston, MA. pp. 1–6. doi:10.1109/MLSP.2015.7324315.
[11] Knyazev, Andrew V. (2003). Boley; Dhillon; Ghosh; Kogan (eds.). "Modern preconditioned eigensolvers for spectral image segmentation and graph bisection" (PDF). Clustering Large Data Sets; Third IEEE International Conference on Data Mining (ICDM 2003) Melbourne, Florida: IEEE Computer Society. pp. 59–62.
[12] McQueen, James; et al. (2016). "Megaman: Scalable Manifold Learning in Python". Journal of Machine Learning Research. 17 (148): 1–5.
[13] Naumov, Maxim (2016). "Fast Spectral Graph Partitioning on GPUs". NVIDIA Developer Blog.

External links

LOBPCG code in MATLAB
LOBPCG code in Octave
LOBPCG code in SciPy
LOBPCG code in Java at Google Code
LOBPCG in Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) at Bitbucket
LOBPCG in Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) at Google Code (retiring)

@@ Line 54: / Line 54: @@
 ==Applications==
 ===[[Material Sciences]]===
 LOBPCG is implemented in [[ABINIT]] (including [[CUDA]] version) and [[Octopus (software)|Octopus]]. It has been used for multi-billion size matrices by [[Gordon Bell Prize]] finalists, on the [[Earth Simulator]] [[supercomputer]] in Japan.<ref>{{Cite conference| doi = 10.1109/SC.2005.1| title = 16.447 TFlops and 159-Billion-dimensional Exact-diagonalization for Trapped Fermion-Hubbard Model on the Earth Simulator| work = Proc. ACM/IEEE Conference on Supercomputing (SC'05)| pages = 44| year = 2005| last1 = Yamada | first1 = S.| last2 = Imamura | first2 = T.| last3 = Machida | first3 = M.| isbn = 1-59593-061-2}}</ref><ref>{{Cite conference| doi = 10.1145/1188455.1188504| title = Gordon Bell finalists I—High-performance computing for exact numerical approaches to quantum many-body problems on the earth simulator| conference = Proc. ACM/IEEE conference on Supercomputing (SC '06)| pages = 47| year = 2006| last1 = Yamada | first1 = S. | last2 = Imamura | first2 = T. | last3 = Kano | first3 = T. | last4 = Machida | first4 = M. | isbn = 0769527000}}</ref> Recent implementations include TTPY,<ref>{{Cite journal | doi = 10.1063/1.4962420| title = Calculating vibrational spectra of molecules using tensor train decomposition| journal =  J. Chem. Phys. | volume = 145| year = 2016| issue = 145| pages = 124101| last1 =  Rakhuba| first1 = Maxim | last2 =  Oseledets | first2 =   Ivan| bibcode = 2016JChPh.145l4101R| arxiv =1605.08422}}</ref> Platypus‐QM,<ref>{{Cite journal | doi = 10.1002/jcc.24318 | title = Development of massive multilevel molecular dynamics simulation program, platypus (PLATform for dYnamic protein unified simulation), for the elucidation of protein functions| journal = J. Comput. 
Chem.| volume = 37| issue = 12| pages = 1125–1132| year = 2016| last1 =  Takano| first1 = Yu | last2 =  Nakata | first2 =  Kazuto| last3 =  Yonezawa | first3 =  Yasushige| last4 =  Nakamura | first4 =  Haruki| pmc =4825406}}</ref> and MFDn.<ref>{{Cite arxiv |eprint=1609.01689 | title = Accelerating Nuclear Configuration Interaction Calculations through a Preconditioned Block Iterative Eigensolver|class=cs.NA | year = 2016| last1 =  Shao| first1 = Meiyue | display-authors =  etal}}</ref>
+[[Hubbard model]] for strongly-correlated electron systems to understand the mechanism behind the [[superconductivity]] uses LOBPCG to calculate the [[ground state]] of the [[Hamiltonian]] on the [[K computer]]. <ref>{{Cite conference| doi = 10.1007/978-3-319-69953-0_14 | conference = Asian Conference on Supercomputing Frontiers | title = High Performance LOBPCG Method for Solving Multiple Eigenvalues of Hubbard Model: Efficiency of Communication Avoiding Neumann Expansion Preconditioner | | work = Yokota R., Wu W. (eds) Supercomputing Frontiers. SCFA 2018. Lecture Notes in Computer Science, vol 10776. Springer, Cham| pages =  243-256| year = 2018| last1 = Yamada | first1 = S.| last2 = Imamura | first2 = T.| last3 = Machida | first3 = M.}}</ref>
 ===[[Maxwell's equations]]===
