Min-max theorem

"Variational theorem" redirects here. It is not to be confused with variational principle.

In linear algebra and functional analysis, the min-max theorem, or variational theorem, or Courant–Fischer–Weyl min-max principle, is a result that gives a variational characterization of eigenvalues of compact Hermitian operators on Hilbert spaces. It can be viewed as the starting point of many results of similar nature.

This article first discusses the finite-dimensional case and its applications before considering compact operators on infinite-dimensional Hilbert spaces. For compact operators, the proof of the main theorem uses essentially the same idea as the finite-dimensional argument.

In the case that the operator is non-Hermitian, the theorem provides an equivalent characterization of the associated singular values. The min-max theorem can be extended to self-adjoint operators that are bounded below.

Matrices

Let A be an n × n Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient R_A : C^n \ {0} → R defined by

R_A(x) = \frac{(Ax, x)}{(x,x)}

where (⋅, ⋅) denotes the Euclidean inner product on C^n. The Rayleigh quotient of an eigenvector is its associated eigenvalue: if Ax = λx, then R_A(x) = λ. Equivalently, the Rayleigh–Ritz quotient can be replaced by

f(x) = (Ax, x), \; \|x\| = 1.

For Hermitian matrices, the range of the continuous function R_A(x), or f(x), is a compact interval [a, b] of the real line. The maximum b and the minimum a are the largest and smallest eigenvalues of A, respectively. The min-max theorem is a refinement of this fact.
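
The following is a minimal numerical sketch of this fact, not part of the original argument. The example matrix and the helper names `rayleigh_quotient` and `hermitian_2x2_eigenvalues` are assumptions made for illustration, and the inner product is taken to be linear in its first argument. It samples random vectors in C^2 and collects their Rayleigh quotients, which should be real and lie between the smallest and largest eigenvalue.

```python
import math
import random


def rayleigh_quotient(A, x):
    """Return (Ax, x) / (x, x) for a square matrix A (list of rows) and nonzero x."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Inner product convention (assumed): (u, v) = sum_i u_i * conj(v_i).
    num = sum(Ax[i] * x[i].conjugate() for i in range(n))
    den = sum(abs(xi) ** 2 for xi in x)
    return num / den


def hermitian_2x2_eigenvalues(A):
    """Closed-form eigenvalues (largest first) of a 2x2 Hermitian matrix."""
    a, d = A[0][0].real, A[1][1].real
    b = A[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return mean + radius, mean - radius


# Hypothetical example matrix; Hermitian because A[1][0] == conj(A[0][1]).
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]
lam_max, lam_min = hermitian_2x2_eigenvalues(A)

# Sample random nonzero vectors in C^2 and record their Rayleigh quotients.
rng = random.Random(0)
samples = []
for _ in range(1000):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    samples.append(rayleigh_quotient(A, x))

print("eigenvalues:", lam_min, lam_max)
print("sampled range:", min(r.real for r in samples), max(r.real for r in samples))
```

The sampled range only approaches the eigenvalues from inside the interval; the endpoints themselves are reached at eigenvectors.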

Min-max Theorem

Let A be an n × n Hermitian matrix with eigenvalues λ1 ≥ ... ≥ λk ≥ ... ≥ λn. Then

\lambda_k = \max \{ \min \{ R_A(x) \mid x \in U \text{ and } x \neq 0 \} \mid \dim(U)=k \}

and

\lambda_k = \min \{ \max \{ R_A(x) \mid x \in U \text{ and } x \neq 0 \} \mid \dim(U)=n-k+1 \}

in particular,

\lambda_n \leq R_A(x) \leq \lambda_1 \quad\forall x \in \mathbf{C}^n\backslash\{0\}

and these bounds are attained when x is an eigenvector for the corresponding eigenvalue.

Taking k = 1 in the first formula gives a simpler characterization of the maximal eigenvalue λ1:

\max \{R_A(x) : x \neq 0 \} = \lambda_1.
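
As a worked illustration (the matrix below is a hypothetical example, not taken from the text), let A = diag(3, 2, 1), so that λ1 = 3, λ2 = 2, λ3 = 1 and the eigenvectors are the standard basis vectors e1, e2, e3. Its Rayleigh quotient is

R_A(x) = \frac{3|x_1|^2 + 2|x_2|^2 + |x_3|^2}{|x_1|^2 + |x_2|^2 + |x_3|^2}.

For k = 2, both formulas range over two-dimensional subspaces, since n − k + 1 = 2. On U = span{e1, e2} the quotient takes values in [2, 3], and on U = span{e2, e3} it takes values in [1, 2], so

\min \{ R_A(x) \mid x \in \operatorname{span}\{e_1, e_2\},\ x \neq 0 \} = 2 = \max \{ R_A(x) \mid x \in \operatorname{span}\{e_2, e_3\},\ x \neq 0 \}.

Other choices do no better: on U = span{e1, e3}, for instance, the minimum is 1 and the maximum is 3, consistent with λ2 = 2 being both the max-min and the min-max value.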

Proof

Since the matrix A is Hermitian, it is diagonalizable and we can choose an orthonormal basis of eigenvectors {u1, ..., un}; that is, ui is an eigenvector for the eigenvalue λi, with (ui, ui) = 1 and (ui, uj) = 0 for all i ≠ j.

If U is a subspace of dimension k, then its intersection with the subspace span{uk, ..., un} is nontrivial, because the dimensions k and n − k + 1 add up to more than n. Hence there exists a vector v ≠ 0 in this intersection that we can write as

v = \sum_{i=k}^n \alpha_i u_i

and whose Rayleigh quotient is

R_A(v) = \frac{\sum_{i=k}^n \lambda_i |\alpha_i|^2}{\sum_{i=k}^n |\alpha_i|^2} \leq \lambda_k

and hence

\min \{ R_A(x) \mid x \in U \text{ and } x \neq 0 \} \leq \lambda_k

Since this holds for every k-dimensional subspace U, we can conclude that

\max \{ \min \{ R_A(x) \mid x \in U \text{ and } x \neq 0 \} \mid \dim(U)=k \} \leq \lambda_k

Since the value λk is attained for U = span{u1, ..., uk}, equality holds.

In the case where U is a subspace of dimension n − k + 1, we proceed in a similar fashion. Consider the k-dimensional subspace span{u1, ..., uk}. Its intersection with U is nontrivial (again by counting dimensions), and hence there exists a vector v ≠ 0 in this intersection that we can write as

v = \sum_{i=1}^k \alpha_i u_i

and whose Rayleigh quotient is

R_A(v) = \frac{\sum_{i=1}^k \lambda_i |\alpha_i|^2}{\sum_{i=1}^k |\alpha_i|^2} \geq \lambda_k

and hence

\max \{ R_A(x) \mid x \in U \text{ and } x \neq 0 \} \geq \lambda_k

Since this holds for every subspace U of dimension n − k + 1, we can conclude that

\min \{ \max \{ R_A(x) \mid x \in U \text{ and } x \neq 0 \} \mid \dim(U)=n-k+1 \} \geq \lambda_k

Since the value λk is attained for U = span{uk, ..., un}, equality holds.
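
The two inequalities in the proof can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it fixes the hypothetical matrix A = diag(3, 2, 1), so λ2 = 2 and n − k + 1 = k = 2, and draws random two-dimensional subspaces of R^3. It relies on the standard fact that the extreme values of the Rayleigh quotient on span{q1, q2}, for orthonormal q1 and q2, are the eigenvalues of the 2 × 2 matrix with entries (A qi, qj). The helper names are made up for this example.

```python
import math
import random

DIAG = [3.0, 2.0, 1.0]  # A = diag(3, 2, 1), so lambda_2 = 2


def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def random_orthonormal_pair(rng):
    """Gram-Schmidt applied to two random Gaussian vectors in R^3."""
    u = [rng.gauss(0, 1) for _ in range(3)]
    v = [rng.gauss(0, 1) for _ in range(3)]
    norm_u = math.sqrt(dot(u, u))
    q1 = [a / norm_u for a in u]
    proj = dot(v, q1)
    w = [b - proj * a for a, b in zip(q1, v)]
    norm_w = math.sqrt(dot(w, w))
    q2 = [a / norm_w for a in w]
    return q1, q2


def extreme_rayleigh_on_plane(q1, q2):
    """Min and max of R_A over span{q1, q2}, for orthonormal q1, q2."""
    Aq1 = [d * a for d, a in zip(DIAG, q1)]
    Aq2 = [d * a for d, a in zip(DIAG, q2)]
    b11, b12, b22 = dot(Aq1, q1), dot(Aq1, q2), dot(Aq2, q2)
    mean = (b11 + b22) / 2
    radius = math.hypot((b11 - b22) / 2, b12)
    return mean - radius, mean + radius


rng = random.Random(1)
results = [extreme_rayleigh_on_plane(*random_orthonormal_pair(rng))
           for _ in range(500)]

# Every plane should have min <= lambda_2 <= max.
print(all(lo <= 2.0 + 1e-9 and hi >= 2.0 - 1e-9 for lo, hi in results))
```

Calling `extreme_rayleigh_on_plane` on the coordinate plane spanned by the first two basis vectors gives the pair where the minimum equals λ2, which is the subspace used at the end of the first half of the proof.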

Counterexample in the non-Hermitian case

Let N be the nilpotent matrix

\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.

Define the Rayleigh quotient R_N(x) exactly as above in the Hermitian case. The only eigenvalue of N is zero, while the Rayleigh quotient takes the value 1/2 (for example at x = (1, 1)), which is also its maximum modulus. That is, the Rayleigh quotient can be larger than the maximum eigenvalue.
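
The computation behind this counterexample can be written out as follows, assuming the inner product is linear in its first argument (with the other convention the quotient is replaced by its complex conjugate, and the bound is unchanged). For x = (x_1, x_2) we have Nx = (x_2, 0), so

R_N(x) = \frac{x_2 \overline{x_1}}{|x_1|^2 + |x_2|^2}, \qquad |R_N(x)| = \frac{|x_1| \, |x_2|}{|x_1|^2 + |x_2|^2} \leq \frac{1}{2},

where the last step uses 2|x_1||x_2| ≤ |x_1|^2 + |x_2|^2. Equality holds at x = (1, 1), where R_N(x) = 1/2 > 0.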

Applications

Min-max principle for singular values

The singular values {σk} of a square matrix M are the square roots of the eigenvalues of M*M (equivalently MM*). Applying the min-max theorem to M*M, with the singular values listed in increasing order σ1↑ ≤ ... ≤ σn↑, gives

\sigma_k ^{\uparrow} = \min_{S:\dim(S)=k} \max_{x \in S, \|x\| = 1} (M^* Mx, x)^{\frac{1}{2}}=\min_{S:\dim(S)=k} \max_{x \in S, \|x\| = 1} \| Mx \|.

Similarly,

\sigma_k ^{\uparrow} = \max_{S:\dim(S)=n-k+1} \min_{x \in S, \|x\| = 1} \| Mx \|.
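
For n = 2 the min-max formula for singular values is easy to check numerically: every one-dimensional subspace is spanned by a unit vector x = (cos t, sin t), so the smallest singular value is the minimum of ||Mx|| over t and the largest is the maximum. The sketch below assumes a hypothetical real 2 × 2 matrix M, and the helper names are invented for this example.

```python
import math


def singular_values_2x2(M):
    """Singular values (smallest first) of a real 2x2 matrix via M^T M."""
    (a, b), (c, d) = M
    # Entries of the symmetric Gram matrix G = M^T M.
    g11 = a * a + c * c
    g12 = a * b + c * d
    g22 = b * b + d * d
    mean = (g11 + g22) / 2
    radius = math.hypot((g11 - g22) / 2, g12)
    return math.sqrt(max(mean - radius, 0.0)), math.sqrt(mean + radius)


def norm_Mx(M, t):
    """Return ||M x|| for the unit vector x = (cos t, sin t)."""
    x1, x2 = math.cos(t), math.sin(t)
    return math.hypot(M[0][0] * x1 + M[0][1] * x2,
                      M[1][0] * x1 + M[1][1] * x2)


M = [[3.0, 0.0],
     [4.0, 5.0]]  # hypothetical example matrix
sigma_min, sigma_max = singular_values_2x2(M)

# Scan the half circle 0 <= t < pi; x and -x span the same subspace.
values = [norm_Mx(M, math.pi * i / 10000) for i in range(10000)]

print("singular values:", sigma_min, sigma_max)
print("scanned min and max of ||Mx||:", min(values), max(values))
```

The scan is a grid approximation, so the agreement with the closed-form values is only up to the grid resolution.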

Cauchy interlacing theorem

Let A be a symmetric n × n matrix. The m × m matrix B, where m ≤ n, is called a compression of A if there exists an orthogonal projection P onto a subspace of dimension m such that P*AP = B. The Cauchy interlacing theorem states:

Theorem. If the eigenvalues of A are α1 ≤ ... ≤ αn, and those of B are β1 ≤ ... ≤ βj ≤ ... ≤ βm, then for all j ≤ m,
\alpha_j \leq \beta_j \leq \alpha_{n-m+j}.

This can be proven using the min-max principle. Let βi have corresponding eigenvector bi and let Sj be the j-dimensional subspace Sj = span{b1, ..., bj}. Then

\beta_j = \max_{x \in S_j, \|x\| = 1} (Bx, x) = \max_{x \in S_j, \|x\| = 1} (P^*APx, x) \geq \min_{S_j} \max_{x \in S_j, \|x\| = 1} (Ax, x) = \alpha_j.

Here the last equality is the min-max characterization of αj for eigenvalues listed in increasing order, so αj ≤ βj. On the other hand, if we define Sm−j+1 = span{bj, ..., bm}, then

\beta_j = \min_{x \in S_{m-j+1}, \|x\| = 1} (Bx, x) = \min_{x \in S_{m-j+1}, \|x\| = 1} (P^*APx, x)= \min_{x \in S_{m-j+1}, \|x\| = 1} (Ax, x) \leq \alpha_{n-m+j},

where the last inequality follows from the max-min characterization of αn−m+j.

Notice that, when n − m = 1, we have αj ≤ βj ≤ αj+1, hence the name interlacing theorem.
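
A small numerical check of the interlacing inequalities can look like the sketch below. The tridiagonal matrix A is a hypothetical example whose eigenvalues 2 − √2, 2, 2 + √2 are known in closed form, and B is taken to be its leading 2 × 2 principal submatrix, which is the compression obtained when P is the 3 × 2 matrix with columns e1 and e2. The helper names are assumptions made for this example.

```python
import math


def det3(M):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def symmetric_2x2_eigenvalues(B):
    """Closed-form eigenvalues (smallest first) of a real symmetric 2x2 matrix."""
    mean = (B[0][0] + B[1][1]) / 2
    radius = math.hypot((B[0][0] - B[1][1]) / 2, B[0][1])
    return [mean - radius, mean + radius]


A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

# Known eigenvalues of this tridiagonal matrix, in increasing order.
alpha = [2 - math.sqrt(2), 2.0, 2 + math.sqrt(2)]

# Sanity check: each alpha_j should be a root of det(A - t I).
residuals = [
    det3([[A[row][col] - (t if row == col else 0.0) for col in range(3)]
          for row in range(3)])
    for t in alpha
]

# Compression of A onto span{e1, e2}: the leading 2x2 principal submatrix.
B = [row[:2] for row in A[:2]]
beta = symmetric_2x2_eigenvalues(B)

# Zero-based form of alpha_j <= beta_j <= alpha_{n-m+j} for j = 1, ..., m.
n, m = 3, 2
interlaced = all(alpha[j] <= beta[j] <= alpha[n - m + j] for j in range(m))

print("alpha:", alpha)
print("beta:", beta)
print("interlaced:", interlaced)
```

Because n − m = 1 here, the check amounts to α1 ≤ β1 ≤ α2 ≤ β2 ≤ α3.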

Compact operators

Let A be a compact, Hermitian operator on a Hilbert space H. Recall that the spectrum of such an operator is a set of real numbers whose only possible cluster point is zero, and that every nonzero number in the spectrum is an eigenvalue. It no longer makes sense here to list the positive eigenvalues in increasing order. Let the positive eigenvalues of A be

\cdots \le \lambda_k \le \cdots \le \lambda_1,

where multiplicity is taken into account as in the matrix case, and let u1, u2, ... be corresponding orthonormal eigenvectors. When H is infinite-dimensional, the above sequence of eigenvalues may be infinite. We now apply the same reasoning as in the matrix case. Let Sk ⊂ H be a k-dimensional subspace, and let S' be the closure of the linear span S' = span{uk, uk+1, ...}. The subspace S' has codimension k − 1. By the same dimension count argument as in the matrix case, S' ∩ Sk is nontrivial. So there exists x ∈ S' ∩ Sk with ||x|| = 1. Since it is an element of S', such an x necessarily satisfies

(Ax, x) \le \lambda_k.

Therefore, for every k-dimensional subspace Sk,

\inf_{x \in S_k, \|x\| = 1}(Ax,x) \le \lambda_k

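Because A is compact, the function f(x) = (Ax, x) is weakly continuous on bounded sets. Furthermore, every bounded, weakly closed set in H is weakly compact (and the unit sphere of the finite-dimensional subspace Sk is in fact norm-compact). This lets us replace the infimum by a minimum: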

\min_{x \in S_k, \|x\| = 1}(Ax,x) \le \lambda_k.

So

\sup_{S_k} \min_{x \in S_k, \|x\| = 1}(Ax,x) \le \lambda_k.

Because equality is achieved when Sk = span{u1, ..., uk},

\max_{S_k} \min_{x \in S_k, \|x\| = 1}(Ax,x) = \lambda_k.

This is the first part of the min-max theorem for compact self-adjoint operators.

Analogously, consider now a (k − 1)-dimensional subspace Sk−1, whose orthogonal complement is denoted by Sk−1⊥. If S' = span{u1, ..., uk}, then

S' \cap S_{k-1}^{\perp} \neq \{0\}.

So

\exists x \in S_{k-1}^{\perp} \, \|x\| = 1, (Ax, x) \ge \lambda_k.

This implies

\max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) \ge \lambda_k

where the compactness of A was applied. Taking the infimum over the collection of (k − 1)-dimensional subspaces gives

\inf_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) \ge \lambda_k.

Pick Sk−1 = span{u1, ..., uk−1} and we deduce

\min_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) = \lambda_k.

In summary,

Theorem (Min-Max). Let A be a compact, self-adjoint operator on a Hilbert space H, whose positive eigenvalues are listed in decreasing order ... ≤ λk ≤ ... ≤ λ1. Then:
\begin{align}
\max_{S_k} \min_{x \in S_k, \|x\| = 1} (Ax,x) &= \lambda_k ^{\downarrow}, \\
\min_{S_{k-1}} \max_{x \in S_{k-1}^{\perp}, \|x\|=1} (Ax, x) &= \lambda_k^{\downarrow}.
\end{align}

A similar pair of equalities holds for negative eigenvalues.
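
As a simple illustration (a standard example, chosen here as an assumption rather than taken from the text), let H = ℓ² with orthonormal basis e1, e2, ... and let A be the diagonal operator with A ej = ej / j. This operator is compact and self-adjoint, with positive eigenvalues λk = 1/k. For a unit vector x = Σ cj ej,

(Ax, x) = \sum_{j \geq 1} \frac{|c_j|^2}{j}.

On Sk = span{e1, ..., ek} this is at least 1/k, with equality at x = ek, and on the orthogonal complement of Sk−1 = span{e1, ..., ek−1} it is at most 1/k, again with equality at x = ek:

\min_{x \in S_k, \|x\| = 1} (Ax, x) = \frac{1}{k} = \max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x).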
