
Change of basis

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Rludlow (talk | contribs) at 09:18, 26 June 2007 (added External links section and link to Google Video). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In linear algebra, a basis for a vector space of dimension n is a sequence of n vectors α1, ..., αn with the property that every vector in the space can be expressed uniquely as a linear combination of the basis vectors. Since it is often desirable to work with more than one basis for a vector space, it is of fundamental importance in linear algebra to be able to easily transform coordinate-wise representations of vectors and linear transformations taken with respect to one basis to their equivalent representations with respect to another basis. Such a transformation is called a change of basis.

Although the terminology of vector spaces is used below and the symbol R can be taken to mean the field of real numbers, the results discussed hold whenever R is a commutative ring and vector space is everywhere replaced with free R-module.

Preliminary notions

The usual basis for Rn is {e1, ..., en}, where ej = (0, ..., 1, 0, ..., 0) is the element of Rn with 1 in the j-th place and 0s elsewhere.

If T : Rn → Rm is a linear transformation, the m × n matrix of T is the matrix t whose j-th column is T(ej) for j = 1, ..., n. In this case we have T(x) = tx for all x in Rn, where we regard x as a column vector and the multiplication on the right side is matrix multiplication. It is a basic fact in linear algebra that the vector space Hom(Rn, Rm) of all linear transformations from Rn to Rm is naturally isomorphic to the space Rm × n of m × n matrices over R; that is, a linear transformation T : Rn → Rm is for all intents and purposes equivalent to its matrix t.
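This recipe can be sketched in a few lines of Python (the transformation T below is an arbitrary example chosen for illustration, not taken from the text): the matrix t is assembled column by column from the images T(ej), after which the matrix-vector product tx reproduces T(x).

```python
# A minimal sketch, with an assumed example map T : R^2 -> R^3.

def T(v):
    # an example linear transformation T(x, y) = (x + 2y, 3x, -y)
    x, y = v
    return (x + 2 * y, 3 * x, -y)

e1, e2 = (1, 0), (0, 1)  # the usual basis of R^2

# the j-th column of t is T(e_j)
columns = [T(e1), T(e2)]
t = [[columns[j][i] for j in range(2)] for i in range(3)]  # a 3 x 2 matrix

def apply(t, x):
    # matrix-vector product tx, with x regarded as a column vector
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in t)

print(apply(t, (5, 7)))  # agrees with T(5, 7) = (19, 15, -7)
```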

We will also make use of the following simple observation.

Theorem Let V and W be vector spaces, let {α1, ..., αn} be a basis for V, and let {γ1, ..., γn} be any n vectors in W. Then there exists a unique linear transformation T : V → W with T(αj) = γj for j = 1, ..., n.

This unique T is defined by T(x1α1 + ... + xnαn) = x1γ1 + ... + xnγn. Of course, if {γ1, ..., γn} happens to be a basis for W, then T is bijective as well as linear; in other words, T is an isomorphism. If in this case we also have W = V, then T is said to be an automorphism.

Now let V be a vector space over R and suppose {α1, ..., αn} is a basis for V. By definition, if ξ is a vector in V then ξ = x1α1 + ... + xnαn for a unique choice of scalars x1, ..., xn in R called the coordinates of ξ relative to the ordered basis {α1, ..., αn}. The vector x = (x1, ..., xn) in Rn is called the coordinate tuple of ξ (relative to this basis). The unique linear map φ : Rn → V with φ(ej) = αj for j = 1, ..., n is called the coordinate isomorphism for V and the basis {α1, ..., αn}. Thus φ(x) = ξ if and only if ξ = x1α1 + ... + xnαn.
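A small worked example may help (the basis of R2 below is an arbitrary choice, assumed only for illustration): φ sends a coordinate tuple to the corresponding linear combination of basis vectors, and its inverse recovers the coordinates by solving a 2 × 2 linear system, here via Cramer's rule.

```python
# A hedged sketch of the coordinate isomorphism phi for an assumed basis.

a1, a2 = (1, 1), (1, -1)  # an assumed ordered basis for R^2

def phi(x):
    # phi(x1, x2) = x1*a1 + x2*a2
    x1, x2 = x
    return (x1 * a1[0] + x2 * a2[0], x1 * a1[1] + x2 * a2[1])

def phi_inv(xi):
    # coordinates of xi relative to {a1, a2}, via Cramer's rule
    det = a1[0] * a2[1] - a2[0] * a1[1]
    x1 = (xi[0] * a2[1] - a2[0] * xi[1]) / det
    x2 = (a1[0] * xi[1] - xi[0] * a1[1]) / det
    return (x1, x2)

xi = (3, 1)
print(phi_inv(xi))  # (2.0, 1.0): xi = 2*a1 + 1*a2
print(phi(phi_inv(xi)))  # recovers (3.0, 1.0)
```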

Change of coordinates

First we examine the question of how the coordinates of a vector ξ in V change when we select another basis. Suppose {α1, ..., αn} and {α'1, ..., α'n} are two ordered bases for V. Let φ1 and φ2 be the corresponding coordinate isomorphisms from Rn to V, i.e. φ1(ej) = αj and φ2(ej) = α'j for j = 1, ..., n. If x = (x1, ..., xn) is the coordinate n-tuple of ξ with respect to the first basis, so that ξ = φ1(x), then the coordinate tuple of ξ with respect to the second basis is φ2-1(ξ) = φ2-1(φ1(x)). Now the map φ2-1 o φ1 is an automorphism on Rn and therefore has a matrix p. Moreover, the j-th column of p is φ2-1 o φ1(ej) = φ2-1(αj), that is, the coordinate n-tuple of αj with respect to the second basis {α'1, ..., α'n}. Thus y = φ2-1(φ1(x)) = px is the coordinate n-tuple of ξ with respect to the basis {α'1, ..., α'n}.
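The construction of p can be carried out concretely (the two bases of R2 below are assumed purely for illustration): each column of p is the coordinate tuple of a first-basis vector relative to the second basis, and multiplying p by the old coordinates yields the new ones.

```python
# A sketch of the change-of-coordinates matrix p for two assumed bases of R^2.

basis1 = [(1, 1), (1, -1)]   # {alpha_1, alpha_2}
basis2 = [(1, 0), (1, 1)]    # {alpha'_1, alpha'_2}

def coords(v, basis):
    # coordinates of v relative to a basis of R^2, by Cramer's rule;
    # the basis vectors are the columns (a, c) and (b, d)
    (a, c), (b, d) = basis
    det = a * d - b * c
    return ((v[0] * d - b * v[1]) / det, (a * v[1] - v[0] * c) / det)

# the j-th column of p is the coordinate tuple of alpha_j in the second basis
cols = [coords(v, basis2) for v in basis1]
p = [[cols[j][i] for j in range(2)] for i in range(2)]
print(p)  # [[0.0, 2.0], [1.0, -1.0]]

# xi = 1*alpha_1 + 1*alpha_2 = (2, 0); its second-basis coordinates are p x
x = (1, 1)
y = tuple(sum(row[j] * x[j] for j in range(2)) for row in p)
print(y)                       # (2.0, 0.0)
print(coords((2, 0), basis2))  # the same tuple, computed directly
```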

The matrix of a linear transformation

Now suppose T : V → W is a linear transformation, {α1, ..., αn} is a basis for V and {β1, ..., βm} is a basis for W. Let φ and ψ be the coordinate isomorphisms for V and W, respectively, relative to the given bases. Then the map T1 = ψ-1 o T o φ is a linear transformation from Rn to Rm, and therefore has a matrix t; its j-th column is ψ-1(T(αj)) for j = 1, ..., n. This matrix is called the matrix of T with respect to the ordered bases {α1, ..., αn} and {β1, ..., βm}. If η = T(ξ) and y and x are the coordinate tuples of η and ξ, then y = ψ-1(T(φ(x))) = tx. Conversely, if ξ is in V and x = φ-1(ξ) is the coordinate tuple of ξ with respect to {α1, ..., αn}, and we set y = tx and η = ψ(y), then η = ψ(T1(x)) = T(ξ). That is, if ξ is in V and η is in W and x and y are their coordinate tuples, then y = tx if and only if η = T(ξ).
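As an illustration of the rule "the j-th column of t is ψ-1(T(αj))" (the map and basis below are assumed, not from the text): take V = W = R2, give both the ordered basis {(1, 1), (1, -1)}, and let T swap coordinates. In that basis T turns out to be diagonal.

```python
# A sketch: the matrix of the coordinate swap T(x, y) = (y, x) relative
# to the assumed basis {(1, 1), (1, -1)} used for both domain and codomain.

basis = [(1, 1), (1, -1)]

def T(v):
    return (v[1], v[0])

def coords(v):
    # coordinates of v relative to {(1, 1), (1, -1)}: v = c1*(1,1) + c2*(1,-1)
    return ((v[0] + v[1]) / 2, (v[0] - v[1]) / 2)

cols = [coords(T(a)) for a in basis]  # j-th column is coords(T(alpha_j))
t = [[cols[j][i] for j in range(2)] for i in range(2)]
print(t)  # [[1.0, 0.0], [0.0, -1.0]] -- T is diagonal in this basis
```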

Theorem Suppose U, V and W are vector spaces of finite dimension and an ordered basis is chosen for each. If T : U → V and S : V → W are linear transformations with matrices t and s respectively, then the matrix of the linear transformation S o T : U → W (with respect to the given bases) is st.
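The theorem can be checked numerically; the two matrices below are arbitrary examples, with all three spaces taken to be R2 in the usual basis.

```python
# A quick numerical check that the matrix of S o T is the product st.

def matmul(a, b):
    # product of two matrices given as lists of rows
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, x):
    # matrix-vector product, x regarded as a column vector
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in m)

t = [[1, 2], [3, 4]]   # matrix of T : R^2 -> R^2 (arbitrary example)
s = [[0, 1], [1, 1]]   # matrix of S : R^2 -> R^2 (arbitrary example)

st = matmul(s, t)      # matrix of the composition S o T
x = (5, -2)
print(apply(st, x) == apply(s, apply(t, x)))  # True
```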

Change of basis

Now we ask what happens to the matrix of T : V → W when we change bases in V and W. Let {α1, ..., αn} and {β1, ..., βm} be ordered bases for V and W respectively, and suppose we are given a second pair of bases {α'1, ..., α'n} and {β'1, ..., β'm}. Let φ1 and φ2 be the coordinate isomorphisms taking the usual basis in Rn to the first and second bases for V, and let ψ1 and ψ2 be the isomorphisms taking the usual basis in Rm to the first and second bases for W.

Let T1 = ψ1-1 o T o φ1, and T2 = ψ2-1 o T o φ2 (both maps taking Rn to Rm), and let t1 and t2 be their respective matrices. Let p and q be the matrices of the change-of-coordinates automorphisms φ2-1 o φ1 on Rn and ψ2-1 o ψ1 on Rm.

The relationships of these various maps to one another are illustrated in the following commutative diagram.

(insert standard change-of-basis diagram)

Since we have T2 = ψ2-1 o T o φ2 = (ψ2-1 o ψ1) o T1 o (φ1-1 o φ2), and since composition of linear maps corresponds to matrix multiplication, it follows that

t2 = q t1 p-1.

If it happens to be the case that W = V, so that we can naturally take {β1, ..., βn} = {α1, ..., αn} and {β'1, ..., β'n} = {α'1, ..., α'n}, then this expression becomes

t2 = p t1 p-1.

In this situation the invertible matrix p is called a change-of-basis matrix for the vector space V, and the equation above says that the matrices t1 and t2 are similar.
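The similarity relation can be verified on a small example (the map and bases are assumed for illustration): let t1 be the matrix of the coordinate swap T(x, y) = (y, x) in the usual basis of R2, and take the second basis to be {(1, 1), (1, -1)}.

```python
# A numerical check of t2 = p t1 p^{-1} in the V = W case.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

t1 = [[0, 1], [1, 0]]           # the swap T in the usual basis
p = [[0.5, 0.5], [0.5, -0.5]]   # columns: coordinates of e1, e2 in the new basis
p_inv = [[1, 1], [1, -1]]       # columns: the new basis vectors themselves

t2 = matmul(matmul(p, t1), p_inv)
print(t2)  # [[1.0, 0.0], [0.0, -1.0]] -- t1 and t2 are similar matrices
```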

The matrix of a bilinear form

A bilinear form on a vector space V over a field R is a mapping V × V → R which is linear in both arguments. That is, B : V × V → R is bilinear if the maps

v ↦ B(v, w) and v ↦ B(w, v)

are linear for each w in V. This definition applies equally well to modules over a commutative ring, with linear maps being module homomorphisms.

The Gram matrix G attached to a basis {α1, ..., αn} is defined by

Gij = B(αi, αj).

If x = (x1, ..., xn) and y = (y1, ..., yn) are the coordinate tuples of vectors v, w with respect to this basis, then the bilinear form is given by

B(v, w) = xT G y,

where x and y are regarded as column vectors and xT denotes the transpose of x.

The matrix will be symmetric if the bilinear form B is a symmetric bilinear form.

Change of basis

If P is the invertible matrix representing a change of basis from {α1, ..., αn} to {α'1, ..., α'n} (so that the j-th column of P is the coordinate tuple of α'j with respect to the old basis), then the Gram matrix transforms by the matrix congruence

G' = PT G P.
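The congruence can be checked on a simple case (the form and basis are assumed for illustration): let B be the usual dot product on R2, whose Gram matrix in the standard basis is the identity, and change to the basis {(1, 1), (1, -1)}.

```python
# A sketch of the congruence G' = P^T G P for the dot product on R^2.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

G = [[1, 0], [0, 1]]   # Gram matrix of the dot product in the standard basis
P = [[1, 1], [1, -1]]  # columns are the new basis vectors (1,1) and (1,-1)

G_new = matmul(matmul(transpose(P), G), P)
print(G_new)  # [[2, 0], [0, 2]] -- matches B(a'_i, a'_j) computed directly
```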

Example from mechanics

(this example to be replaced or amended)

Let's say we have a train rolling on rails. Using a cartesian coordinate system, let the rail be headed straight in the X-direction (east, on most maps). Now, if we push the train in the X-direction, it will move, but if we try to push the train in the Y-direction (north), it won't be able to move (without derailing).

We could formulate this as a matrix A, where the first column shows the acceleration of the train when pushed in the X-direction = (1,0), and the second column shows the acceleration of the train when pushed in the Y-direction = (0,0):

A = [ 1 0 ]
    [ 0 0 ]

Now, let's say that we want the rail to be headed in northeasterly direction (45 degrees on the compass).

How should our matrix look now? If we push the train in the X-direction, it should move a little in the X-direction, but it should also move in the Y-direction.

A basis change lets us find the matrix B which describes the movement of the train on a northeasterly rail.

All we need to do is to change basis to a basis where the X-axis is in the direction of the rails, multiply with our matrix A, and then change back to the original basis.

A rotation matrix for a 45 degree rotation looks like this:

R = [ cos 45°  -sin 45° ] = 1/√2 [ 1 -1 ]
    [ sin 45°   cos 45° ]        [ 1  1 ]

Let the direction we're pushing the train in be P. Putting P in our new basis:

R-1P

Applying our matrix A:

A(R-1P)

Changing back to the original basis:

R(A(R-1P))

And using the matrix multiplication laws, we can remove the parentheses:

(RAR-1)P

And identify the matrix we were looking for:

B = RAR-1

And by remembering that the inverse of a rotation matrix is simply its transpose (this step isn't really necessary, but the transpose is a quicker operation to do by hand than finding the inverse), our final answer is:

B = RART = 1/2 [ 1 1 ]
               [ 1 1 ]

We can now see (by looking at the first column of the matrix) that if we push the train in the X-direction, it will move in the direction of the rail: B(1, 0) = (1/2, 1/2).

If we try to push in the north-westerly direction, the train will not move: B(-1/√2, 1/√2) = (0, 0).

So the matrix we found seems to do the trick.
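The arithmetic of this example can be checked with a short computation. The matrices follow the description in the text: A sends an eastward push to eastward motion and a northward push to nothing, and R is a rotation by 45 degrees.

```python
# A numerical check of B = R A R^{-1} for the train example.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

c = 1 / math.sqrt(2)
A = [[1, 0], [0, 0]]       # train moves only along the east-west rail
R = [[c, -c], [c, c]]      # rotation by 45 degrees
R_inv = [[c, c], [-c, c]]  # inverse rotation (the transpose of R)

B = matmul(matmul(R, A), R_inv)
print(B)  # approximately [[0.5, 0.5], [0.5, 0.5]]

# pushing east moves the train to the northeast...
accel = tuple(sum(row[j] * (1, 0)[j] for j in range(2)) for row in B)
print(accel)  # approximately (0.5, 0.5)

# ...while pushing northwest, perpendicular to the rail, does nothing
push_nw = (-c, c)
accel_nw = tuple(sum(row[j] * push_nw[j] for j in range(2)) for row in B)
print(accel_nw)  # approximately (0.0, 0.0)
```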