Gram–Schmidt process

From Wikipedia, the free encyclopedia

In mathematics, particularly linear algebra and numerical analysis, the Gram–Schmidt process is a method for orthogonalizing a set of vectors in an inner product space, most commonly the Euclidean space R^n. The Gram–Schmidt process takes a finite, linearly independent set S = {v1, …, vk} with k ≤ n and generates an orthogonal set S' = {u1, …, uk} that spans the same k-dimensional subspace of R^n as S.

The method is named for Jørgen Pedersen Gram and Erhard Schmidt, but it appeared earlier in the work of Laplace and Cauchy. In the theory of Lie group decompositions it is generalized by the Iwasawa decomposition.

Applying the Gram–Schmidt process to the column vectors of a full column rank matrix yields the QR decomposition, in which the matrix is factored into a matrix with orthonormal columns and an upper triangular matrix.

The Gram–Schmidt process

We define the projection operator by

\mathrm{proj}_{\mathbf{u}}\,(\mathbf{v}) = {\langle \mathbf{u}, \mathbf{v}\rangle\over\langle \mathbf{u}, \mathbf{u}\rangle}\mathbf{u} = {\langle \mathbf{u}, \mathbf{v}\rangle} {\mathbf{u}\over\langle \mathbf{u}, \mathbf{u}\rangle},

where 〈u, v〉 denotes the inner product of the vectors u and v. This operator projects the vector v orthogonally onto the line spanned by the vector u.
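
As a minimal sketch of this operator for real vectors, assuming the standard dot product as the inner product, the projection can be written in plain Python. The function names `inner` and `proj` are illustrative choices made here, not part of any library.

```python
def inner(u, v):
    """Standard dot product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))


def proj(u, v):
    """Orthogonal projection of v onto the line spanned by a nonzero u."""
    scale = inner(u, v) / inner(u, u)
    return [scale * a for a in u]


u = [1.0, 1.0]
v = [2.0, 0.0]
p = proj(u, v)                              # [1.0, 1.0]
residual = [b - a for a, b in zip(p, v)]    # [1.0, -1.0], orthogonal to u
```

Subtracting the projection from v leaves a residual orthogonal to u, which is the basic step the process repeats.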

The Gram–Schmidt process then works as follows:

\begin{align}
\mathbf{u}_1 &= \mathbf{v}_1, & \mathbf{e}_1 &= {\mathbf{u}_1 \over \|\mathbf{u}_1\|} \\
\mathbf{u}_2 &= \mathbf{v}_2-\mathrm{proj}_{\mathbf{u}_1}\,(\mathbf{v}_2), & \mathbf{e}_2 &= {\mathbf{u}_2 \over \|\mathbf{u}_2\|} \\
\mathbf{u}_3 &= \mathbf{v}_3-\mathrm{proj}_{\mathbf{u}_1}\,(\mathbf{v}_3)-\mathrm{proj}_{\mathbf{u}_2}\,(\mathbf{v}_3), & \mathbf{e}_3 &= {\mathbf{u}_3 \over \|\mathbf{u}_3\|} \\
\mathbf{u}_4 &= \mathbf{v}_4-\mathrm{proj}_{\mathbf{u}_1}\,(\mathbf{v}_4)-\mathrm{proj}_{\mathbf{u}_2}\,(\mathbf{v}_4)-\mathrm{proj}_{\mathbf{u}_3}\,(\mathbf{v}_4), & \mathbf{e}_4 &= {\mathbf{u}_4 \over \|\mathbf{u}_4\|} \\
& \,\,\, \vdots & & \,\,\, \vdots \\
\mathbf{u}_k &= \mathbf{v}_k-\sum_{j=1}^{k-1}\mathrm{proj}_{\mathbf{u}_j}\,(\mathbf{v}_k), & \mathbf{e}_k &= {\mathbf{u}_k\over \|\mathbf{u}_k \|}
\end{align}

Figure: The first two steps of the Gram–Schmidt process.

The sequence u1, …, uk is the required system of orthogonal vectors, and the normalized vectors e1, …, ek form an orthonormal set. The calculation of the sequence u1, …, uk is known as Gram–Schmidt orthogonalization, whilst the calculation of the sequence e1, …, ek is known as Gram–Schmidt orthonormalization as the vectors are normalized.
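
The formulas above can be sketched directly in Python. This is a minimal illustration for real vectors with the standard dot product, assuming linearly independent input; the function name `gram_schmidt` and the return convention (a pair of lists) are choices made here for illustration.

```python
import math


def inner(u, v):
    """Standard dot product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Classical Gram-Schmidt; returns (orthogonal us, orthonormal es).

    Assumes the input vectors are linearly independent.
    """
    us, es = [], []
    for v in vectors:
        u = list(v)
        for w in us:
            # Subtract proj_w(v); the coefficient uses the original v.
            c = inner(w, v) / inner(w, w)
            u = [ui - c * wi for ui, wi in zip(u, w)]
        us.append(u)
        norm = math.sqrt(inner(u, u))
        es.append([ui / norm for ui in u])
    return us, es


us, es = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

Here `us` corresponds to the orthogonalization and `es` to the orthonormalization described above.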

To check that these formulas yield an orthogonal sequence, first compute 〈u1, u2〉 by substituting the above formula for u2: we get zero. Then use this to compute 〈u1, u3〉 again by substituting the formula for u3: we get zero. The general proof proceeds by mathematical induction.
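
Written out for the first pair, and assuming the inner product is linear in its second argument (as the projection formula above suggests), the substitution reads:

\langle \mathbf{u}_1, \mathbf{u}_2\rangle = \left\langle \mathbf{u}_1, \mathbf{v}_2 - {\langle \mathbf{u}_1, \mathbf{v}_2\rangle \over \langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 \right\rangle = \langle \mathbf{u}_1, \mathbf{v}_2\rangle - {\langle \mathbf{u}_1, \mathbf{v}_2\rangle \over \langle \mathbf{u}_1, \mathbf{u}_1\rangle}\langle \mathbf{u}_1, \mathbf{u}_1\rangle = 0.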

Geometrically, this method proceeds as follows: to compute ui, it projects vi orthogonally onto the subspace U generated by u1, …, ui−1, which is the same as the subspace generated by v1, …, vi−1. The vector ui is then defined to be the difference between vi and this projection, guaranteed to be orthogonal to all of the vectors in the subspace U.

The Gram–Schmidt process also applies to a linearly independent infinite sequence {vi}i. The result is an orthogonal (or orthonormal) sequence {ui}i such that, for every natural number n, the algebraic span of v1, …, vn is the same as that of u1, …, un.

If the Gram–Schmidt process is applied to a linearly dependent sequence, it outputs the 0 vector on the ith step whenever \mathbf{v}_i is a linear combination of \mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{i-1}. If the input may be linearly dependent, the algorithm should test for zero vectors in the output and discard them, both so that the outputs remain an orthogonal set of nonzero vectors and to avoid division by zero when producing an orthonormal basis. The number of vectors output by the algorithm is then the dimension of the space spanned by the original inputs.
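
A minimal sketch of this zero-vector test in Python follows. In floating-point arithmetic an exact comparison with zero is rarely reliable, so the sketch compares the norm against a threshold; the function name `orthonormalize_discarding`, the parameter `tol`, and its absolute default value are assumptions made for illustration, and a tolerance scaled to the input would usually be preferable in practice.

```python
import math


def inner(u, v):
    """Standard dot product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize_discarding(vectors, tol=1e-12):
    """Orthonormalize, dropping vectors that depend on earlier ones.

    `tol` is a hypothetical absolute threshold below which a residual
    is treated as the zero vector.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(e, w)  # e is a unit vector, so no division is needed
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


# The second vector is a multiple of the first, so it is discarded.
vs = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
basis = orthonormalize_discarding(vs)
print(len(basis))  # 2, the dimension of the span of the inputs
```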

Example

Consider the following set of vectors in R^2 (with the conventional inner product):

S = \left\lbrace\mathbf{v}_1=\begin{pmatrix} 3 \\ 1\end{pmatrix}, \mathbf{v}_2=\begin{pmatrix}2 \\2\end{pmatrix}\right\rbrace.

Now perform Gram–Schmidt to obtain an orthogonal set of vectors:

\mathbf{u}_1=\mathbf{v}_1=\begin{pmatrix}3\\1\end{pmatrix}
\mathbf{u}_2 = \mathbf{v}_2 - \mathrm{proj}_{\mathbf{u}_1} \, (\mathbf{v}_2) = \begin{pmatrix}2\\2\end{pmatrix} - {\langle \mathbf{u}_1, \mathbf{v}_2\rangle \over \langle \mathbf{u}_1, \mathbf{u}_1\rangle}\begin{pmatrix}3\\1\end{pmatrix} = \begin{pmatrix}2\\2\end{pmatrix} - \frac{8}{10}\begin{pmatrix}3\\1\end{pmatrix} = \begin{pmatrix} -2/5 \\6/5 \end{pmatrix}.

We check that the vectors u1 and u2 are indeed orthogonal:

\langle\mathbf{u}_1,\mathbf{u}_2\rangle = \left\langle \begin{pmatrix}3\\1\end{pmatrix}, \begin{pmatrix}-2/5\\6/5\end{pmatrix} \right\rangle = -\frac65 + \frac65 = 0.

We can then normalize the vectors by dividing each by its norm, as shown above:

\mathbf{e}_1 = {1 \over \sqrt {10}}\begin{pmatrix}3\\1\end{pmatrix}
\mathbf{e}_2 = {1 \over \sqrt{40 \over 25}} \begin{pmatrix}-2/5\\6/5\end{pmatrix}
 = {1\over\sqrt{10}} \begin{pmatrix}-1\\3\end{pmatrix}.
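
The arithmetic in this example can be checked with a short Python sketch. It uses `fractions.Fraction` from the standard library so that the orthogonalization step is exact, and switches to floating point only for the square roots in the normalization; the helper name `inner` is an illustrative choice.

```python
import math
from fractions import Fraction


def inner(u, v):
    """Standard dot product on R^2 (the conventional inner product)."""
    return sum(a * b for a, b in zip(u, v))


v1 = [Fraction(3), Fraction(1)]
v2 = [Fraction(2), Fraction(2)]

u1 = v1
c = inner(u1, v2) / inner(u1, u1)          # 8/10 = 4/5
u2 = [b - c * a for a, b in zip(u1, v2)]   # [-2/5, 6/5]

# Normalization needs a square root, so switch to floats here.
e1 = [float(a) / math.sqrt(inner(u1, u1)) for a in u1]
e2 = [float(a) / math.sqrt(inner(u2, u2)) for a in u2]
```

With exact rationals, `inner(u1, u2)` evaluates to exactly zero, matching the check above.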

Numerical stability

When this process is implemented on a computer, the computed vectors uk are often not quite orthogonal because of rounding errors. For the Gram–Schmidt process as described above (sometimes referred to as "classical Gram–Schmidt") this loss of orthogonality is particularly severe; the (classical) Gram–Schmidt process is therefore said to be numerically unstable.

The Gram–Schmidt process can be stabilized by a small modification. Instead of computing the vector uk as

 \mathbf{u}_k = \mathbf{v}_k - \mathrm{proj}_{\mathbf{u}_1}\,(\mathbf{v}_k) - \mathrm{proj}_{\mathbf{u}_2}\,(\mathbf{v}_k) - \cdots - \mathrm{proj}_{\mathbf{u}_{k-1}}\,(\mathbf{v}_k),

it is computed as

 \begin{align}
\mathbf{u}_k^{(1)} &= \mathbf{v}_k - \mathrm{proj}_{\mathbf{u}_1}\,(\mathbf{v}_k), \\
\mathbf{u}_k^{(2)} &= \mathbf{u}_k^{(1)} - \mathrm{proj}_{\mathbf{u}_2} \, (\mathbf{u}_k^{(1)}), \\
& \,\,\, \vdots \\
\mathbf{u}_k^{(k-2)} &= \mathbf{u}_k^{(k-3)} - \mathrm{proj}_{\mathbf{u}_{k-2}} \, (\mathbf{u}_k^{(k-3)}), \\
\mathbf{u}_k &= \mathbf{u}_k^{(k-2)} - \mathrm{proj}_{\mathbf{u}_{k-1}} \, (\mathbf{u}_k^{(k-2)}). 
\end{align}

This approach (sometimes referred to as "modified Gram–Schmidt") gives the same result as the original formula in exact arithmetic, but it introduces smaller errors in finite-precision arithmetic.
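
The difference can be observed with a small experiment. The sketch below is an illustration under stated assumptions rather than a benchmark: it uses plain Python floats (IEEE double precision), the standard dot product, and a nearly dependent test input chosen here with `eps = 1e-8`, so that `1 + eps**2` rounds to 1. The names `classical_gs`, `modified_gs` and `max_offdiag` are illustrative.

```python
import math


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(inner(v, v))
    return [a / n for a in v]


def classical_gs(vectors):
    qs = []
    for v in vectors:
        # All coefficients are computed from the original v.
        coeffs = [inner(q, v) for q in qs]
        w = list(v)
        for c, q in zip(coeffs, qs):
            w = [wi - c * qi for wi, qi in zip(w, q)]
        qs.append(normalize(w))
    return qs


def modified_gs(vectors):
    qs = []
    for v in vectors:
        w = list(v)
        for q in qs:
            c = inner(q, w)  # uses the partially reduced w, not v
            w = [wi - c * qi for wi, qi in zip(w, q)]
        qs.append(normalize(w))
    return qs


def max_offdiag(qs):
    """Largest |<q_i, q_j>| over i != j; zero for an orthonormal set."""
    return max(abs(inner(qs[i], qs[j]))
               for i in range(len(qs)) for j in range(i))


eps = 1e-8
vs = [[1.0, eps, 0.0, 0.0], [1.0, 0.0, eps, 0.0], [1.0, 0.0, 0.0, eps]]
err_classical = max_offdiag(classical_gs(vs))
err_modified = max_offdiag(modified_gs(vs))
```

On this input the classical variant returns second and third vectors whose inner product is roughly 0.5, while the modified variant keeps all pairwise inner products on the order of `eps`.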

Algorithm

The following algorithm implements the stabilized Gram–Schmidt orthonormalization. The vectors v1, …, vk are replaced by orthonormal vectors which span the same subspace.

for j from 1 to k do
    for i from 1 to j − 1 do
        \mathbf{v}_j \leftarrow \mathbf{v}_j - \mathrm{proj}_{\mathbf{v}_{i}} \, \mathbf{v}_j (remove component in direction vi)
    next i
    \mathbf{v}_j \leftarrow \frac{\mathbf{v}_j}{\|\mathbf{v}_j\|} (normalize)
next j
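
One way to render this pseudocode as runnable Python is sketched below, assuming real vectors stored as lists of floats, the standard dot product, and linearly independent input (so that no norm is zero). The function name `orthonormalize_in_place` is an illustrative choice.

```python
import math


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize_in_place(vs):
    """Stabilized Gram-Schmidt; overwrites each vs[j] with a unit vector.

    Assumes the vectors are linearly independent, so no norm is zero.
    """
    for j in range(len(vs)):
        for i in range(j):
            # vs[i] is already a unit vector, so proj reduces to <vi, vj> vi.
            c = inner(vs[i], vs[j])
            vs[j] = [a - c * b for a, b in zip(vs[j], vs[i])]
        norm = math.sqrt(inner(vs[j], vs[j]))
        vs[j] = [a / norm for a in vs[j]]


vs = [[3.0, 1.0], [2.0, 2.0]]
orthonormalize_in_place(vs)
# vs now holds approximately (3, 1)/sqrt(10) and (-1, 3)/sqrt(10)
```

Because each vs[i] with i < j has already been normalized, the projection needs no division by 〈vi, vi〉.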

The cost of this algorithm is asymptotically 2nk^2 floating point operations, where n is the dimensionality of the vectors (Golub & Van Loan 1996, §5.2.8).
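
As a rough accounting of where the leading term comes from (a sketch that counts each multiplication or addition as one operation and ignores the lower-order cost of normalization): each pass through the inner loop computes one inner product and one scaled vector subtraction, about 2n operations each, and the inner loop body runs k(k − 1)/2 times in total, so

\sum_{j=1}^{k} \sum_{i=1}^{j-1} 4n = 4n \cdot \frac{k(k-1)}{2} = 2nk(k-1) \approx 2nk^2.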

Alternatives

Other orthogonalization algorithms use Householder transformations or Givens rotations. The algorithms using Householder transformations are more stable than the stabilized Gram–Schmidt process. On the other hand, the Gram–Schmidt process produces the jth orthogonalized vector after the jth iteration, while orthogonalization using Householder reflections produces all the vectors only at the end. This makes only the Gram–Schmidt process applicable for iterative methods like the Arnoldi iteration.

References
