Rotation matrix

From Wikipedia, the free encyclopedia

In linear algebra, a rotation matrix is a matrix that is used to perform a rotation in Euclidean space. For example, the matrix

R =
\begin{bmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta \\
\end{bmatrix}

rotates points in the xy-Cartesian plane counter-clockwise through an angle θ about the origin of the Cartesian coordinate system. To perform the rotation using a rotation matrix R, the position of each point must be represented by a column vector v, containing the coordinates of the point. A rotated vector is obtained by using the matrix multiplication Rv.

Rotation matrices also provide a means of numerically representing an arbitrary rotation of the axes about the origin, without appealing to angular specification. These coordinate rotations are a natural way to express the orientation of a camera, or the attitude of a spacecraft, relative to a reference set of axes. Once an observational platform's local X-Y-Z axes are expressed numerically as three direction vectors in world coordinates, together they form the rows of the rotation matrix R (world → platform) that transforms directions (expressed in world coordinates) into equivalent directions expressed in platform-local coordinates.

The examples in this article apply to rotation of vectors anti-clockwise in a right-handed system by pre-multiplication. If any one of these is changed (e.g. rotating axes instead of vectors), then the transpose of the example matrix should be used.

Since matrix multiplication has no effect on the zero vector (the coordinates of the origin), rotation matrices can only be used to describe rotations about the origin of the coordinate system. Rotation matrices provide an algebraic description of such rotations, and are used extensively for computations in geometry, physics, and computer graphics.

Rotation matrices are square matrices with real entries. More specifically, they can be characterized as orthogonal matrices with determinant 1,

R^{T} = R^{-1}, \det R = 1\,.

In some literature, the term rotation is generalized to include improper rotations, characterized by orthogonal matrices with determinant −1 (instead of +1). These combine proper rotations with reflections (which invert orientation). In other cases, where reflections are not being considered, the label proper may be dropped. This convention is followed in this article.

The set of all orthogonal matrices of size n with determinant +1 forms a group known as the special orthogonal group SO(n). The set of all orthogonal matrices of size n with determinant +1 or -1 forms the (general) orthogonal group O(n).

In two dimensions

A counterclockwise rotation of a vector through angle θ. The vector is initially aligned with the x-axis.

In two dimensions, every rotation matrix has the following form,


R(\theta) = \begin{bmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta \\
\end{bmatrix}.

This rotates column vectors by means of the following matrix multiplication,


\begin{bmatrix}
x' \\
y' \\
\end{bmatrix} = \begin{bmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta \\
\end{bmatrix}\begin{bmatrix}
x \\
y \\
\end{bmatrix}.

So the coordinates (x',y') of the point (x,y) after rotation are

x' = x \cos \theta - y \sin \theta\,,
y' = x \sin \theta + y \cos \theta\,.
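As a minimal sketch, the two formulas above translate directly into code. The function name `rotate2d` is an illustrative choice rather than anything from the source, and the angle is assumed to be given in radians.

```python
import math

def rotate2d(x, y, theta):
    """Rotate the point (x, y) counterclockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    # x' = x cos(theta) - y sin(theta),  y' = x sin(theta) + y cos(theta)
    return (x * c - y * s, x * s + y * c)

# Rotating the unit vector along x by 90 degrees gives the unit vector
# along y, up to floating-point rounding.
xp, yp = rotate2d(1.0, 0.0, math.pi / 2)
```

Because the rotation is about the origin, the distance of the point from the origin is unchanged, which is an easy sanity check for any implementation.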

The direction of vector rotation is counterclockwise if θ is positive (e.g. 90°), and clockwise if θ is negative (e.g. −90°). Thus the clockwise rotation matrix is found as


R(-\theta) = \begin{bmatrix}
\cos \theta & \sin \theta \\
-\sin \theta & \cos \theta \\
\end{bmatrix}\,.

The two-dimensional case is the only non-trivial case (that is, excluding one dimension) in which the group of rotation matrices is commutative, so the order in which multiple rotations are performed does not matter. An alternative convention uses rotating axes,[1] and the matrix R(θ) above also represents a rotation of the axes clockwise through an angle θ.

Non-standard orientation of the coordinate system

A rotation through angle θ with non-standard axes.

If a standard right-handed Cartesian coordinate system is used, with the x axis to the right and the y axis up, the rotation R(θ) is counterclockwise. If a left-handed Cartesian coordinate system is used, with x directed to the right but y directed down, R(θ) is clockwise. Such non-standard orientations are rarely used in mathematics but are common in 2D computer graphics, which often has the origin in the top left corner and the y-axis pointing down the screen or page.[2]

See below for other alternative conventions which may change the sense of the rotation produced by a rotation matrix.

Common rotations

Particularly useful are the matrices for 90°, 180°, and 270° rotations,


R(90^\circ) = \begin{bmatrix}
0 & -1 \\[3pt]
1 & 0 \\
\end{bmatrix} (90° counterclockwise rotation)
R(180^\circ) = \begin{bmatrix}
-1 & 0 \\[3pt]
0 & -1 \\
\end{bmatrix} (180° rotation in either direction – a half-turn)
R(270^\circ) = \begin{bmatrix}
0 & 1 \\[3pt]
-1 & 0 \\
\end{bmatrix} (270° counterclockwise rotation, the same as a 90° clockwise rotation)

In three dimensions

Basic rotations

A basic rotation (also called an elemental rotation) is a rotation about one of the axes of a coordinate system. The following three basic rotation matrices rotate vectors by an angle θ about the x, y, or z axis, in three dimensions, using the right-hand rule. (The same matrices can also represent a clockwise rotation of the axes.[3])


\begin{alignat}{1}
R_x(\theta) &= \begin{bmatrix}
1 & 0 & 0 \\
0 & \cos \theta &  -\sin \theta \\[3pt]
0 & \sin \theta  &  \cos \theta \\[3pt]
\end{bmatrix} \\[6pt]
R_y(\theta) &= \begin{bmatrix}
\cos \theta & 0 & \sin \theta \\[3pt]
0 & 1 & 0 \\[3pt]
-\sin \theta & 0 & \cos \theta \\
\end{bmatrix} \\[6pt]
R_z(\theta) &= \begin{bmatrix}
\cos \theta &  -\sin \theta & 0 \\[3pt]
\sin \theta & \cos \theta & 0\\[3pt]
0 & 0 & 1\\
\end{bmatrix}
\end{alignat}

For column vectors, each of these basic vector rotations appears counter-clockwise when the axis about which they occur points toward the observer, the coordinate system is right-handed, and the angle θ is positive. Rz, for instance, would rotate toward the y-axis a vector aligned with the x-axis, as can easily be checked by operating with Rz on the vector (1,0,0):

 R_z(90^\circ) \begin{bmatrix} 1 \\ 0 \\ 0 \\ \end{bmatrix} =
\begin{bmatrix} \cos 90^\circ &  -\sin 90^\circ & 0 \\ \sin 90^\circ & \cos 90^\circ & 0\\ 0 & 0 & 1\\ \end{bmatrix} 
\begin{bmatrix} 1 \\ 0 \\ 0 \\ \end{bmatrix} =
\begin{bmatrix} 0 &  -1 & 0 \\ 1 & 0 & 0\\ 0 & 0 & 1\\ \end{bmatrix} 
\begin{bmatrix} 1 \\ 0 \\ 0 \\ \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \end{bmatrix}

This is similar to the rotation produced by the above mentioned 2-D rotation matrix. See below for alternative conventions which may apparently or actually invert the sense of the rotation produced by these matrices.
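The three basic rotations and the check on (1,0,0) can be sketched in a few lines of plain Python. The helper names (`rot_x`, `rot_y`, `rot_z`, `apply`) are illustrative assumptions, matrices are stored as nested lists, and angles are in radians.

```python
import math

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]

def rot_y(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply(R, v):
    """Pre-multiply the column vector v by the 3x3 matrix R."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

# Rz(90 degrees) carries the x axis onto the y axis, as in the worked example.
v = apply(rot_z(math.pi / 2), [1.0, 0.0, 0.0])
```

With the same convention, `rot_x` by 90° carries the y axis onto the z axis, and `rot_y` by 90° carries the z axis onto the x axis.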

General rotations

Other rotation matrices can be obtained from these three using matrix multiplication. For example, the product

R = R_z(\alpha) \, R_y(\beta) \, R_x(\gamma)\,\!

represents a rotation whose yaw, pitch, and roll angles are α, β, and γ, respectively. More formally, it is an intrinsic rotation whose Tait-Bryan angles are α, β, γ, about axes z, y, x respectively. Similarly, the product

R = R_y(\gamma) \, R_x(\beta) \, R_y(\alpha)\,\!

represents an extrinsic rotation whose Euler angles are α, β, γ about axes y, x, y.

These matrices produce the desired effect only if they are used to pre-multiply column vectors (see Ambiguities for more details).
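A minimal sketch of the first product above, assuming angles in radians and column vectors. The helper names (`matmul`, `rot_x`, `rot_y`, `rot_z`, `yaw_pitch_roll`) are illustrative and do not come from the source.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def yaw_pitch_roll(alpha, beta, gamma):
    """R = Rz(alpha) Ry(beta) Rx(gamma), to be applied as R v to a column vector v."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_x(gamma)))

# With zero pitch and roll the product reduces to a pure rotation about z.
R = yaw_pitch_roll(0.3, 0.0, 0.0)
```

The order of the factors matters: swapping two of them generally produces a different matrix, because three-dimensional rotations do not commute.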

Conversion from and to axis-angle

Every rotation in three dimensions is defined by its axis (a direction that is left fixed by the rotation) and its angle (the amount of rotation about that axis); this is Euler's rotation theorem.

There are several methods to compute an axis and an angle from a rotation matrix (see also axis-angle). Here, we only describe the method based on the computation of the eigenvectors and eigenvalues of the rotation matrix. It is also possible to use the trace of the rotation matrix.

Determining the axis

A rotation R around axis u can be decomposed using 3 endomorphisms P, (I − P), and Q.

Given a 3×3 rotation matrix R, a vector u parallel to the rotation axis must satisfy

R\textbf{u} = \textbf{u}~,

since the rotation of u around the rotation axis must result in u. The equation above may be solved for u, which is unique up to a scalar factor unless R = I.

Further, the equation may be rewritten

R\textbf{u} = I \textbf{u} \quad \Rightarrow \quad (R - I) \textbf{u} = 0~,

which shows that u lies in the null space of R − I.

Viewed in another way, u is an eigenvector of R corresponding to the eigenvalue λ = 1. Every 3×3 rotation matrix has this eigenvalue.

Determining the angle

To find the angle of a rotation, once the axis of the rotation is known, select a vector v perpendicular to the axis. Then the angle of the rotation is the angle between v and Rv.

A much easier method, however, is to calculate the trace (i.e. the sum of the diagonal elements of the rotation matrix) which is 1+2cosθ. Care should be taken to select the right sign for the angle θ to match the chosen axis.
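The trace method can be sketched as follows. Note one substitution: instead of the eigenvector computation described in the text, this sketch reads the axis off the skew-symmetric part R − R^T, which equals 2 sin θ times the cross product matrix of u for a rotation by θ about the unit vector u. That shortcut only works when 0 < θ < π, and it fixes the sign of the axis so that it matches the angle returned by `acos`. The function name `axis_angle_from_matrix` is hypothetical.

```python
import math

def axis_angle_from_matrix(R):
    """Return (axis, angle) for a 3x3 rotation matrix R given as nested lists.

    Assumes 0 < angle < pi, so that sin(angle) != 0 and the
    skew-symmetric part of R determines the axis.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    # trace = 1 + 2 cos(angle); clamp to guard against rounding error.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    angle = math.acos(cos_angle)
    # R - R^T = 2 sin(angle) [u]_x, so these differences are parallel to u.
    ax = R[2][1] - R[1][2]
    ay = R[0][2] - R[2][0]
    az = R[1][0] - R[0][1]
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm < 1e-12:
        raise ValueError("angle is 0 or pi; find the axis from the null space of R - I")
    return (ax / norm, ay / norm, az / norm), angle

# A 90 degree rotation about the z axis.
axis, angle = axis_angle_from_matrix([[0.0, -1.0, 0.0],
                                      [1.0, 0.0, 0.0],
                                      [0.0, 0.0, 1.0]])
```

For angles of exactly 0 or 180° the skew-symmetric part vanishes, so the sketch raises an error rather than guessing; the null-space method described above still applies in the 180° case.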

Rotation matrix from axis and angle

For some applications, it is helpful to be able to make a rotation with a given axis. Given a unit vector u = (u_x, u_y, u_z), where u_x^2 + u_y^2 + u_z^2 = 1, the matrix for a rotation by an angle of θ about an axis in the direction of u is

R = \begin{bmatrix} \cos \theta +u_x^2 \left(1-\cos \theta\right) & u_x u_y \left(1-\cos \theta\right) - u_z \sin \theta & u_x u_z \left(1-\cos \theta\right) + u_y \sin \theta \\ u_y u_x \left(1-\cos \theta\right) + u_z \sin \theta & \cos \theta + u_y^2\left(1-\cos \theta\right) & u_y u_z \left(1-\cos \theta\right) - u_x \sin \theta \\ u_z u_x \left(1-\cos \theta\right) - u_y \sin \theta & u_z u_y \left(1-\cos \theta\right) + u_x \sin \theta & \cos \theta + u_z^2\left(1-\cos \theta\right) 
\end{bmatrix}.
[4]

This can be written more concisely as

R = \cos\theta\mathbf I + \sin\theta[\mathbf u]_{\times} + (1-\cos\theta)\mathbf{u}\otimes\mathbf{u},

where [\mathbf u]_{\times} is the cross product matrix of u, \otimes is the tensor product and I is the identity matrix. This is a matrix form of Rodrigues' rotation formula, with

 \mathbf{u}\otimes\mathbf{u}  = \begin{bmatrix}
u_x^2   & u_x u_y & u_x u_z \\[3pt]
u_x u_y & u_y^2 & u_y u_z \\[3pt]
u_x u_z & u_y u_z & u_z^2
\end{bmatrix},\qquad  [\mathbf u]_{\times} = \begin{bmatrix}
0  & -u_z & u_y \\[3pt]
u_z & 0 & -u_x \\[3pt]
-u_y & u_x & 0
\end{bmatrix}.

If the 3D space is right-handed, this rotation will be counterclockwise for an observer placed so that the axis u goes in her direction (Right-hand rule).
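The explicit matrix above can be coded entry by entry. This is a sketch under the assumption that `u` is already normalized and `theta` is in radians; the function name `rotation_from_axis_angle` is illustrative.

```python
import math

def rotation_from_axis_angle(u, theta):
    """Rotation matrix for angle theta (radians) about the unit vector u = (ux, uy, uz)."""
    ux, uy, uz = u
    c, s = math.cos(theta), math.sin(theta)
    t = 1.0 - c
    return [
        [c + ux * ux * t,      ux * uy * t - uz * s, ux * uz * t + uy * s],
        [uy * ux * t + uz * s, c + uy * uy * t,      uy * uz * t - ux * s],
        [uz * ux * t - uy * s, uz * uy * t + ux * s, c + uz * uz * t],
    ]

# A quarter turn about the z axis reproduces the basic rotation Rz(90 degrees).
R = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
```

If `u` is not a unit vector the result is not a rotation matrix, so callers would normally normalize the axis first.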

Properties of a rotation matrix

In three dimensions, for any rotation matrix R_{a,\theta} acting on \mathbb{R}^3 , where a is a rotation axis and θ a rotation angle,

  • The eigenvalues of R_{a,\theta} \ are
\{1, e^{\pm i\theta} \} = \{1,\ \cos(\theta)+i\sin(\theta),\ \cos(\theta)-i\sin(\theta)\},
where i is the standard imaginary unit with the property i^2 = -1.
  • The trace of R_{a,\theta} \ is 1 + 2\cos(\theta) \, equivalent to the sum of its eigenvalues.

Some of these properties can be generalised to any number of dimensions. In other words, they hold for any rotation matrix acting on \mathbb{R}^n .

For instance, in two dimensions the properties hold with the following exceptions:

  • a is not a given axis, but a point (rotation center) which must coincide with the origin of the coordinate system in which the rotation is represented.
  • Consequently, the four elements of the rotation matrix depend only on θ, hence we write R_{\theta} \ , rather than R_{a,\theta} \ .
  • The eigenvalues of R_{\theta} \ are \{e^{\pm i\theta} \} = \{\cos(\theta)+i\sin(\theta),\ \cos(\theta)-i\sin(\theta)\}.
  • The trace of R_{\theta} \ is 2\cos(\theta) \, equivalent to the sum of its eigenvalues.

Examples

Geometry

In Euclidean geometry, a rotation is an example of an isometry, a transformation that moves points without changing the distances between them. Rotations are distinguished from other isometries by two additional properties: they leave (at least) one point fixed, and they leave "handedness" unchanged. By contrast, a translation moves every point, a reflection exchanges left- and right-handed ordering, and a glide reflection does both.

A rotation that does not leave "handedness" unchanged is an improper rotation or a rotoinversion.

If we take the fixed point as the origin of a Cartesian coordinate system, then every point can be given coordinates as a displacement from the origin. Thus we may work with the vector space of displacements instead of the points themselves. Now suppose (p_1, …, p_n) are the coordinates of the vector p from the origin, O, to point P. Choose an orthonormal basis for our coordinates; then the squared distance to P, by Pythagoras, is

 d^2(O,P) = \| \bold{p} \|^2 = \sum_{r=1}^n p_r^2

which we can compute using the matrix multiplication

 \| \bold{p} \|^2 = \begin{bmatrix}p_1 \cdots p_n\end{bmatrix} \begin{bmatrix}p_1 \\ \vdots \\ p_n \end{bmatrix} = \bold{p}^T \bold{p} .

A geometric rotation transforms lines to lines, and preserves ratios of distances between points. From these properties we can show that a rotation is a linear transformation of the vectors, and thus can be written in matrix form, Qp. That a rotation preserves not just ratios but distances themselves can be stated as

 \bold{p}^T \bold{p} = (Q \bold{p})^T (Q \bold{p}) , \,\!

or

\begin{align}
 \bold{p}^T  I \bold{p}&{}= (\bold{p}^T Q^T) (Q \bold{p}) \\
                    &{}= \bold{p}^T (Q^T Q) \bold{p} .
\end{align}

Because this equation holds for all vectors, p, we conclude that every rotation matrix, Q, satisfies the orthogonality condition,

 Q^T Q = I . \,\!

Rotations preserve handedness because they cannot change the ordering of the axes, which implies the special matrix condition,

 \det Q = +1 . \,\!

Equally important, we can show that any matrix satisfying these two conditions acts as a rotation.
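The two conditions give a direct numerical test for whether a matrix is a rotation. The sketch below is restricted to the 3×3 case so that the determinant can be written out explicitly; the names `is_rotation_matrix` and `det3` and the tolerance `tol` are assumptions made for illustration.

```python
def transpose(Q):
    return [list(col) for col in zip(*Q)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det3(Q):
    (a, b, c), (d, e, f), (g, h, i) = Q
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_rotation_matrix(Q, tol=1e-9):
    """True if the 3x3 matrix Q satisfies Q^T Q = I and det Q = +1 within tol."""
    QtQ = matmul(transpose(Q), Q)
    orthogonal = all(abs(QtQ[i][j] - (1.0 if i == j else 0.0)) <= tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(Q) - 1.0) <= tol

quarter_turn = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotation by 90 degrees about z
mirror = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]         # reflection: orthogonal, det = -1
```

Here `quarter_turn` passes both conditions, while `mirror` is orthogonal but fails the determinant condition, which is exactly the distinction between proper and improper rotations.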

Multiplication

The inverse of a rotation matrix is its transpose, which is also a rotation matrix:

\begin{align} (Q^T)^T (Q^T) &{}= Q Q^T = I\\ \det Q^T &{}= \det Q = +1. \end{align}

The product of two rotation matrices is a rotation matrix:

\begin{align}
 (Q_1 Q_2)^T (Q_1 Q_2) &{}= Q_2^T (Q_1^T Q_1) Q_2 = I \\
 \det (Q_1 Q_2) &{}= (\det Q_1) (\det Q_2) = +1.
\end{align}

For n greater than 2, multiplication of n×n rotation matrices is not commutative. For example, with n = 3,

\begin{align}
Q_1 &{}= \begin{bmatrix}0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1\end{bmatrix} &
Q_2 &{}= \begin{bmatrix}0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0\end{bmatrix} \\
Q_1 Q_2 &{}= \begin{bmatrix}0 & -1 & 0 \\ 0 & 0 & 1 \\ -1 & 0 & 0\end{bmatrix} &
Q_2 Q_1 &{}= \begin{bmatrix}0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0\end{bmatrix}.
\end{align}
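The two products in this example can be reproduced with exact integer arithmetic. The helper `matmul` is an illustrative name; the matrices are copied from the example.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Q1 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]    # quarter turn about z
Q2 = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]    # quarter turn about y

Q1Q2 = matmul(Q1, Q2)
Q2Q1 = matmul(Q2, Q1)
commute = (Q1Q2 == Q2Q1)                   # False for this pair
```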

Noting that any identity matrix is a rotation matrix, and that matrix multiplication is associative, we may summarize all these properties by saying that the n×n rotation matrices form a group, which for n > 2 is non-abelian. Called a special orthogonal group, and denoted by SO(n), SO(n,R), SOn, or SOn(R), the group of n×n rotation matrices is isomorphic to the group of rotations in an n-dimensional space. This means that multiplication of rotation matrices corresponds to composition of rotations, applied in left-to-right order of their corresponding matrices.

Ambiguities

Alias and alibi rotations

The interpretation of a rotation matrix can be subject to many ambiguities.

In most cases the effect of the ambiguity is equivalent to the effect of a rotation matrix inversion (for these orthogonal matrices equivalently matrix transpose).

Alias or alibi (passive or active) transformation
The coordinates of a point P may change due to either a rotation of the coordinate system CS (alias), or a rotation of the point P (alibi). In the latter case, the rotation of P also produces a rotation of the vector v representing P. In other words, either P and v are fixed while CS rotates (alias), or CS is fixed while P and v rotate (alibi). Any given rotation can be legitimately described both ways, as vectors and coordinate systems actually rotate with respect to each other, about the same axis but in opposite directions. Throughout this article, we chose the alibi approach to describe rotations. For instance,

R(\theta) = \begin{bmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta \\
\end{bmatrix}
represents a counterclockwise rotation of a vector v by an angle θ, or a rotation of CS by the same angle but in the opposite direction (i.e. clockwise). Alibi and alias transformations are also known as active and passive transformations, respectively.
Pre-multiplication or post-multiplication
The same point P can be represented either by a column vector v or a row vector w. Rotation matrices can either pre-multiply column vectors (Rv), or post-multiply row vectors (wR). However, Rv produces a rotation in the opposite direction with respect to wR. Throughout this article, we described rotations produced on column vectors by means of a pre-multiplication. To obtain exactly the same rotation (i.e. the same final coordinates of point P), the row vector must be post-multiplied by the transpose of R (wR^T).
Right- or left-handed coordinates
The matrix and the vector can be represented with respect to a right-handed or left-handed coordinate system. Throughout the article, we assumed a right-handed orientation, unless otherwise specified.
Vectors or forms
The vector space has a dual space of linear forms, and the matrix can act on either vectors or forms.

Decompositions

Independent planes

Consider the 3×3 rotation matrix

 Q = \begin{bmatrix} 0.36 & 0.48 & -0.8 \\ -0.8 & 0.60 & 0 \\ 0.48 & 0.64 & 0.60 \end{bmatrix} .

If Q acts in a certain direction, v, purely as a scaling by a factor λ, then we have

 Q \bold{v} = \lambda \bold{v}, \,\!

so that

 \bold{0} = (\lambda I - Q) \bold{v} . \,\!

Thus λ is a root of the characteristic polynomial for Q,

\begin{align}
 0 &{}= \det (\lambda I - Q) \\
   &{}= \lambda^3 - \tfrac{39}{25} \lambda^2  + \tfrac{39}{25} \lambda - 1 \\
   &{}= (\lambda-1) (\lambda^2 - \tfrac{14}{25} \lambda + 1).
\end{align}

Two features are noteworthy. First, one of the roots (or eigenvalues) is 1, which tells us that some direction is unaffected by the matrix. For rotations in three dimensions, this is the axis of the rotation (a concept that has no meaning in any other dimension). Second, the other two roots are a pair of complex conjugates, whose product is 1 (the constant term of the quadratic), and whose sum is 2 cos θ (the negated linear term). This factorization is of interest for 3×3 rotation matrices because the same thing occurs for all of them. (As special cases, for a null rotation the "complex conjugates" are both 1, and for a 180° rotation they are both −1.) Furthermore, a similar factorization holds for any n×n rotation matrix. If the dimension, n, is odd, there will be a "dangling" eigenvalue of 1; and for any dimension the rest of the polynomial factors into quadratic terms like the one here (with the two special cases noted). We are guaranteed that the characteristic polynomial will have degree n and thus n eigenvalues. And since a rotation matrix commutes with its transpose, it is a normal matrix, so it can be diagonalized. We conclude that every rotation matrix, when expressed in a suitable coordinate system, partitions into independent rotations of two-dimensional subspaces, at most n/2 of them.

The sum of the entries on the main diagonal of a matrix is called the trace; it does not change if we reorient the coordinate system, and always equals the sum of the eigenvalues. This has the convenient implication for 2×2 and 3×3 rotation matrices that the trace reveals the angle of rotation, θ, in the two-dimensional (sub-)space. For a 2×2 matrix the trace is 2 cos(θ), and for a 3×3 matrix it is 1+2 cos(θ). In the three-dimensional case, the subspace consists of all vectors perpendicular to the rotation axis (the invariant direction, with eigenvalue 1). Thus we can extract from any 3×3 rotation matrix a rotation axis and an angle, and these completely determine the rotation.
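For the example matrix Q of this section, the trace relation gives the angle directly. In the sketch below the candidate axis (1, −2, −2)/3 is not stated in the source; it was worked out by hand and is verified in the code by checking that Q leaves it fixed.

```python
import math

Q = [[0.36, 0.48, -0.80],
     [-0.80, 0.60, 0.00],
     [0.48, 0.64, 0.60]]

trace = Q[0][0] + Q[1][1] + Q[2][2]          # 1.56 = 39/25
cos_theta = (trace - 1.0) / 2.0              # from trace = 1 + 2 cos(theta)
theta_deg = math.degrees(math.acos(cos_theta))

# Candidate axis (an assumption checked here, not taken from the source).
u = [1.0 / 3.0, -2.0 / 3.0, -2.0 / 3.0]
Qu = [sum(Q[i][j] * u[j] for j in range(3)) for i in range(3)]
```

The value cos θ = 0.28 = 7/25 agrees with the quadratic factor of the characteristic polynomial, whose negated linear coefficient 14/25 is 2 cos θ, and `Qu` reproduces `u` up to rounding.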

Sequential angles

The constraints on a 2×2 rotation matrix imply that it must have the form

Q = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}

with a^2 + b^2 = 1. Therefore we may set a = cos θ and b = sin θ, for some angle θ. To solve for θ it is not enough to look at a alone or b alone; we must consider both together to place the angle in the correct quadrant, using a two-argument arctangent function.
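A short sketch of the quadrant issue, using Python's two-argument arctangent `math.atan2`. The function name `angle_from_2x2` is illustrative.

```python
import math

def angle_from_2x2(Q):
    """Angle theta (radians, in (-pi, pi]) of the 2x2 rotation matrix [[a, -b], [b, a]]."""
    a, b = Q[0][0], Q[1][0]
    return math.atan2(b, a)   # uses the signs of both a and b to pick the quadrant

theta = -3.0 * math.pi / 4.0                  # third quadrant: a < 0 and b < 0
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
recovered = angle_from_2x2(Q)
from_a_alone = math.acos(Q[0][0])             # loses the sign of the angle
```

Here `recovered` matches `theta`, whereas `from_a_alone` returns +3π/4, because the cosine alone cannot distinguish θ from −θ.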

Now consider the first column of a 3×3 rotation matrix,

\begin{bmatrix}a\\b\\c\end{bmatrix} .

Here a^2 + b^2 will probably equal not 1 but some value r^2 < 1; even so, we can use a slight variation of the previous computation to find a so-called Givens rotation that transforms the column to

\begin{bmatrix}r\\0\\c\end{bmatrix} ,

zeroing b. This acts on the subspace spanned by the x and y axes. We can then repeat the process for the xz subspace to zero c. Acting on the full matrix, these two rotations produce the schematic form

Q_{xz}Q_{xy}Q = \begin{bmatrix}1&0&0\\0&\ast&\ast\\0&\ast&\ast\end{bmatrix} .

Shifting attention to the second column, a Givens rotation of the yz subspace can now zero the z value. This brings the full matrix to the form

Q_{yz}Q_{xz}Q_{xy}Q = \begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix} ,

which is an identity matrix. Thus we have decomposed Q as

Q = Q_{xy}^{-1}Q_{xz}^{-1}Q_{yz}^{-1} .
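The three-step reduction can be sketched for the 3×3 case as follows, applied to the example matrix Q from the independent-planes discussion. The helper names (`givens`, `givens_factors`) are hypothetical, and the sign convention inside `givens` is one choice among several that zero the target entry.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def givens(i, j, a, b):
    """3x3 rotation acting in the (i, j) coordinate plane that sends the
    pair (a, b), held in rows i and j of some column, to (r, 0)."""
    r = math.hypot(a, b)
    c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
    G = [[1.0 if p == q else 0.0 for q in range(3)] for p in range(3)]
    G[i][i], G[i][j] = c, s
    G[j][i], G[j][j] = -s, c
    return G

def givens_factors(Q):
    """Return (Q_xy, Q_xz, Q_yz) such that Q_yz Q_xz Q_xy Q is the identity."""
    Q_xy = givens(0, 1, Q[0][0], Q[1][0])    # zero the y entry of column 1
    M = matmul(Q_xy, Q)
    Q_xz = givens(0, 2, M[0][0], M[2][0])    # zero the z entry of column 1
    M = matmul(Q_xz, M)
    Q_yz = givens(1, 2, M[1][1], M[2][1])    # zero the z entry of column 2
    return Q_xy, Q_xz, Q_yz

Q = [[0.36, 0.48, -0.80],
     [-0.80, 0.60, 0.00],
     [0.48, 0.64, 0.60]]
Q_xy, Q_xz, Q_yz = givens_factors(Q)
reduced = matmul(Q_yz, matmul(Q_xz, matmul(Q_xy, Q)))
# For a rotation matrix the inverse is the transpose, so Q = Q_xy^T Q_xz^T Q_yz^T.
rebuilt = matmul(transpose(Q_xy), matmul(transpose(Q_xz), transpose(Q_yz)))
```

Up to rounding, `reduced` is the identity matrix and `rebuilt` reproduces Q, which mirrors the decomposition stated above.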

An n×n rotation matrix will have (n−1)+(n−2)+⋯+2+1, or

\sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} \,\!

entries below the diagonal to zero. We can zero them by extending the same idea of stepping through the columns with a series of rotations in a fixed sequence of planes. We conclude that the set of n×n rotation matrices, each of which has n^2 entries, can be parameterized by n(n−1)/2 angles.

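Axis sequences for three-angle conventions (subscript w: world axes; subscript b: body axes):

xzx_w xzy_w xyx_w xyz_w
yxy_w yxz_w yzy_w yzx_w
zyz_w zyx_w zxz_w zxy_w
xzx_b yzx_b xyx_b zyx_b
yxy_b zxy_b yzy_b xzy_b
zyz_b xyz_b zxz_b yxz_b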

In three dimensions this restates in matrix form an observation made by Euler, so mathematicians call the ordered sequence of three angles Euler angles. However, the situation is somewhat more complicated than we have so far indicated. Despite the small dimension, we actually have considerable freedom in the sequence of axis pairs we use; and we also have some freedom in the choice of angles. Thus we find many different conventions employed when three-dimensional rotations are parameterized for physics, or medicine, or chemistry, or other disciplines. When we include the option of world axes or body axes, 24 different sequences are possible. And while some disciplines call any sequence Euler angles, others give different names (Euler, Cardano, Tait-Bryan, Roll-pitch-yaw) to different sequences.

One reason for the large number of options is that, as noted previously, rotations in three dimensions (and higher) do not commute. If we reverse a given sequence of rotations, we get a different outcome. This also implies that we cannot compose two rotations by adding their corresponding angles. Thus Euler angles are not vectors, despite a similarity in appearance as a triple of numbers.

Nested dimensions

A 3×3 rotation matrix like

Q_{3 \times 3} = \begin{bmatrix}\cos \theta & \sin \theta & {\color{CadetBlue}0} \\ -\sin \theta & \cos \theta & {\color{CadetBlue}0} \\ {\color{CadetBlue}0} & {\color{CadetBlue}0} & {\color{CadetBlue}1}\end{bmatrix}

suggests a 2×2 rotation matrix,

Q_{2 \times 2} = \begin{bmatrix}\cos \theta & \sin \theta \\ -\sin \theta & \cos \theta\end{bmatrix} ,

is embedded in the upper left corner:

Q_{3 \times 3} = \left[ \begin{matrix} Q_{2 \times 2} & \bold{0} \\ \bold{0}^T & 1 \end{matrix} \right] .

This is no illusion; not just one, but many, copies of n-dimensional rotations are found within (n+1)-dimensional rotations, as subgroups. Each embedding leaves one direction fixed, which in the case of 3×3 matrices is the rotation axis. For example, we have

Q_{\bold{x}}(\theta) = \begin{bmatrix}1 & 0 & 0 \\ 0 & \cos \theta & \sin \theta \\ 0 & -\sin \theta & \cos \theta\end{bmatrix} ,
Q_{\bold{y}}(\theta) = \begin{bmatrix}\cos \theta & 0 & -\sin \theta \\ 0 & 1 & 0 \\ \sin \theta & 0 & \cos \theta\end{bmatrix} ,
Q_{\bold{z}}(\theta) = \begin{bmatrix}\cos \theta & \sin \theta & 0 \\ -\sin \theta & \cos \theta & 0  \\ 0 & 0 & 1\end{bmatrix} ,

fixing the x axis, the y axis, and the z axis, respectively. The rotation axis need not be a coordinate axis; if u = (x,y,z) is a unit vector in the desired direction, then

\begin{align}
 Q_{\bold{u}}(\theta)
 &{}= 
  \begin{bmatrix}
    0&-z&y\\
    z&0&-x\\
    -y&x&0
  \end{bmatrix} \sin \theta + (I - \bold{u}\bold{u}^T) \cos \theta + \bold{u}\bold{u}^T \\
 &{}=
  \begin{bmatrix}
    (1-x^2) c_{\theta} + x^2 & - z s_{\theta} - x y c_{\theta} + x y & y s_{\theta} - x z c_{\theta} + x z \\
    z s_{\theta} - x y c_{\theta} + x y & (1-y^2) c_{\theta} + y^2 & -x s_{\theta} - y z c_{\theta} + y z \\
    -y s_{\theta} - x z c_{\theta} + x z & x s_{\theta} - y z c_{\theta} + y z & (1-z^2) c_{\theta} + z^2
  \end{bmatrix} \\
 &{}=
  \begin{bmatrix}
    x^2 (1-c_{\theta}) + c_{\theta} & x y (1-c_{\theta}) - z s_{\theta} & x z (1-c_{\theta}) + y s_{\theta} \\
    x y (1-c_{\theta}) + z s_{\theta} & y^2 (1-c_{\theta}) + c_{\theta} & y z (1-c_{\theta}) - x s_{\theta} \\
    x z (1-c_{\theta}) - y s_{\theta} & y z (1-c_{\theta}) + x s_{\theta} & z^2 (1-c_{\theta}) + c_{\theta}
  \end{bmatrix} , 
\end{align}

where c_θ = cos θ, s_θ = sin θ, is a rotation by angle θ leaving axis u fixed.

A direction in (n+1)-dimensional space will be a unit magnitude vector, which we may consider a point on a generalized sphere, S^n. Thus it is natural to describe the rotation group SO(n+1) as combining SO(n) and S^n. A suitable formalism is the fiber bundle,

 SO(n) \hookrightarrow SO(n+1) \to S^n , \,\!

where for every direction in the "base space", S^n, the "fiber" over it in the "total space", SO(n+1), is a copy of the "fiber space", SO(n), namely the rotations that keep that direction fixed.

Thus we can build an n×n rotation matrix by starting with a 2×2 matrix, aiming its fixed axis on S^2 (the ordinary sphere in three-dimensional space), aiming the resulting rotation on S^3, and so on up through S^{n−1}. A point on S^n can be selected using n numbers, so we again have n(n−1)/2 numbers to describe any n×n rotation matrix.

In fact, we can view the sequential angle decomposition, discussed previously, as reversing this process. The composition of n−1 Givens rotations brings the first column (and row) to (1,0,…,0), so that the remainder of the matrix is a rotation matrix of dimension one less, embedded so as to leave (1,0,…,0) fixed.

Skew parameters via Cayley's formula

Main article: Skew-symmetric matrix

When an n×n rotation matrix, Q, does not include −1 as an eigenvalue, so that none of the planar rotations of which it is composed are 180° rotations, then Q + I is an invertible matrix. Most rotation matrices fit this description, and for them we can show that (Q − I)(Q + I)^{−1} is a skew-symmetric matrix, A. Thus A^T = −A; and since the diagonal is necessarily zero, and since the upper triangle determines the lower one, A contains n(n−1)/2 independent numbers. Conveniently, I − A is invertible whenever A is skew-symmetric; thus we can recover the original matrix using the Cayley transform,

 A \mapsto (I+A)(I-A)^{-1} , \,\!

which maps any skew-symmetric matrix A to a rotation matrix. In fact, aside from the noted exceptions, we can produce any rotation matrix in this way. Although in practical applications we can hardly afford to ignore 180° rotations, the Cayley transform is still a potentially useful tool, giving a parameterization of most rotation matrices without trigonometric functions.

In three dimensions, for example, we have (Cayley 1846)

\begin{align}
 &\begin{bmatrix}0&-z&y\\z&0&-x\\-y&x&0\end{bmatrix} \mapsto {} \\
 &\quad \frac{1}{1+x^2+y^2+z^2}
 \begin{bmatrix}
 1+x^2-y^2-z^2 & 2 x y-2 z & 2 y+2 x z \\
 2 x y+2 z & 1-x^2+y^2-z^2 & 2 y z-2 x \\
 2 x z-2 y & 2 x+2 y z & 1-x^2-y^2+z^2
 \end{bmatrix} .
\end{align}

If we condense the skew entries into a vector, (x,y,z), then we produce a 90° rotation around the x axis for (1,0,0), around the y axis for (0,1,0), and around the z axis for (0,0,1). The 180° rotations are just out of reach; for, in the limit as x goes to infinity, (x,0,0) does approach a 180° rotation around the x axis, and similarly for other directions.
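The three-dimensional Cayley formula above needs no trigonometric functions, as the following sketch shows. The function name `cayley3` is an illustrative assumption.

```python
def cayley3(x, y, z):
    """Rotation matrix obtained from the skew parameters (x, y, z)
    via the three-dimensional Cayley formula."""
    k = 1.0 / (1.0 + x * x + y * y + z * z)
    return [
        [k * (1 + x*x - y*y - z*z), k * (2*x*y - 2*z),         k * (2*y + 2*x*z)],
        [k * (2*x*y + 2*z),         k * (1 - x*x + y*y - z*z), k * (2*y*z - 2*x)],
        [k * (2*x*z - 2*y),         k * (2*x + 2*y*z),         k * (1 - x*x - y*y + z*z)],
    ]

quarter_turn_z = cayley3(0.0, 0.0, 1.0)     # 90 degree rotation about the z axis
near_half_turn_x = cayley3(1.0e6, 0.0, 0.0) # approaches a 180 degree rotation about x
```

The second call illustrates the limiting behaviour described above: as x grows, the result approaches the diagonal matrix with entries (1, −1, −1) without ever reaching it.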

Decomposition into shears

For the 2D case, a rotation matrix can be decomposed into three shear matrices (Paeth 1986):

\begin{align}
 R(\theta)
 &{}= 
  \begin{bmatrix}
    1 & -\tan (\theta/2)\\
    0 & 1
  \end{bmatrix}
  \begin{bmatrix}
    1 & 0\\
    \sin \theta & 1  
  \end{bmatrix}
  \begin{bmatrix}
    1 & -\tan (\theta/2)\\
    0 & 1
  \end{bmatrix}
\end{align}

This is useful, for instance, in computer graphics, since shears can be implemented with fewer multiplication instructions than rotating a bitmap directly. On modern computers, this may not matter, but it can be relevant for very old or low-end microprocessors.
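The three-shear product can be checked numerically with a small sketch. The names `matmul2` and `rotation_from_shears` are illustrative; the angle is in radians and should stay away from ±180°, where tan(θ/2) is undefined.

```python
import math

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation_from_shears(theta):
    """Build R(theta) as shear_x * shear_y * shear_x, following the formula above."""
    t = -math.tan(theta / 2.0)
    shear_x = [[1.0, t], [0.0, 1.0]]               # horizontal shear
    shear_y = [[1.0, 0.0], [math.sin(theta), 1.0]] # vertical shear
    return matmul2(matmul2(shear_x, shear_y), shear_x)

theta = 0.6
R = rotation_from_shears(theta)
```

Each shear changes only one coordinate, which is what makes the decomposition attractive for in-place bitmap rotation.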

Group theory

Lie group

We have established that n×n rotation matrices form a group, the special orthogonal group, SO(n). This algebraic structure is coupled with a topological structure, in that the operations of multiplication and taking the inverse (which here is merely transposition) are continuous functions of the matrix entries. Thus SO(n) is a classic example of a topological group. (In purely topological terms, it is a compact manifold.) Furthermore, the operations are not only continuous, but smooth, so SO(n) is a differentiable manifold and a Lie group.[5]

Most properties of individual rotations of ℝ^n depend very little on the dimension n; nevertheless, in Lie group theory, we see systematic differences between even dimensions and odd dimensions. Furthermore, there are some isolated irregularities below n = 5; for example, SO(4) is, anomalously, not a simple Lie group. Instead its (only) double cover is isomorphic to the product S^3 × S^3. Either factor is clearly a normal subgroup, and hence so is its image under the double-covering homomorphism. This shows SO(4) is not simple.

Lie algebra

Associated with every Lie group is a Lie algebra, a linear space equipped with a bilinear alternating product called a bracket. The algebra for SO(n) is denoted by

 \mathfrak{so}(n) , \,\!

and consists of all skew-symmetric n×n matrices (as implied by differentiating the orthogonality condition, I = Q^T Q). The bracket, [A_1, A_2], of two skew-symmetric matrices is defined to be A_1 A_2 − A_2 A_1, which is again a skew-symmetric matrix. This Lie algebra bracket captures the essence of the Lie group product via infinitesimals.

For 2×2 rotation matrices, the Lie algebra so(2) is a one-dimensional vector space, mere multiples of

J = \begin{bmatrix}0&-1\\1&0\end{bmatrix} .

Here the bracket always vanishes, which tells us that, in two dimensions, rotations commute. Not so in any higher dimension.

For 3×3 rotation matrices, one has a three-dimensional vector space with the convenient basis


 A_{\bold{x}} = \begin{bmatrix}0&0&0\\0&0&-1\\0&1&0\end{bmatrix} , \quad
 A_{\bold{y}} = \begin{bmatrix}0&0&1\\0&0&0\\-1&0&0\end{bmatrix} , \quad
 A_{\bold{z}} = \begin{bmatrix}0&-1&0\\1&0&0\\0&0&0\end{bmatrix} .

The Lie brackets of these generators are as follows,


 [A_{\bold{x}}, A_{\bold{y}}] = A_{\bold{z}}, \quad
 [A_{\bold{z}}, A_{\bold{x}}] = A_{\bold{y}}, \quad
 [A_{\bold{y}}, A_{\bold{z}}] = A_{\bold{x}}.

We can conveniently identify any matrix in this Lie algebra with a vector in ℝ³,

\begin{align}
 \boldsymbol{\omega} &= (x,y,z) \\
 \boldsymbol{\tilde{\omega}}  &=\boldsymbol{\omega\cdot A} = x A_{\bold{x}} + y A_{\bold{y}} + z A_{\bold{z}} \\
                             &= \begin{bmatrix}0&-z&y\\z&0&-x\\-y&x&0\end{bmatrix} .
\end{align}

Under this identification, the so(3) bracket has a memorable description; it is the vector cross product,

 [\tilde{\bold{u}},\tilde{\bold{v}}] = \widetilde{\bold{u}\!\times\!\bold{v} } ~.

The matrix identified with a vector v is also memorable, because

 \tilde{\bold{v}} \bold{u} = \bold{v} \times \bold{u} . \,\!

Notice this implies that v is in the null space of the skew-symmetric matrix with which it is identified, because v×v is always the zero vector.
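
The identification of vectors with skew-symmetric matrices can be illustrated with a short Python sketch. The function names hat, cross, matvec and bracket are hypothetical helpers chosen for this example; integer test vectors are used so that the comparisons are exact.

```python
def hat(v):
    """Skew-symmetric matrix identified with the vector v = (x, y, z)."""
    x, y, z = v
    return [[0, -z, y], [z, 0, -x], [-y, x, 0]]

def cross(u, v):
    """Vector cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def bracket(A, B):
    """Commutator AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

u, v = [1, 2, 3], [4, 5, 6]
print(matvec(hat(v), u) == cross(v, u))             # hat(v) u equals v x u
print(matvec(hat(v), v) == [0, 0, 0])               # v lies in the null space of hat(v)
print(bracket(hat(u), hat(v)) == hat(cross(u, v)))  # the bracket is the cross product
```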

Exponential map

Connecting the Lie algebra to the Lie group is the exponential map, which is defined using the standard matrix exponential series for e^A,[6]

\begin{align}
 \exp \colon \mathfrak{so}(n) &{}\to SO(n) \\
 A &{}\mapsto I + A + \tfrac{1}{2} A^2 + \tfrac{1}{6} A^3 + \cdots + \tfrac{1}{k!} A^k + \cdots \\
   &{}= \sum_{k=0}^{\infty} \frac{1}{k!} A^k
\end{align} ~.

For any skew-symmetric matrix A, exp(A) is always a rotation matrix.

Note that this exponential map of skew-symmetric matrices to rotation matrices is quite different from the Cayley transform discussed earlier, differing to 3rd order,

e^{2A} - \frac{I+A}{I-A}=- \frac{2}{3} A^3 +\mathrm{O}  (A^4)  ~.
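
The third-order discrepancy can be seen by expanding both sides as power series in A, reading the quotient as (I+A)(I−A)⁻¹ (the two factors commute, so the order does not matter):

\begin{align}
 e^{2A} &= I + 2A + 2A^2 + \tfrac{4}{3} A^3 + \mathrm{O}(A^4) , \\
 (I+A)(I-A)^{-1} &= (I+A)(I + A + A^2 + A^3 + \cdots) = I + 2A + 2A^2 + 2A^3 + \mathrm{O}(A^4) .
\end{align}

The two agree through second order, and the cubic terms differ by (4/3 − 2)A³ = −(2/3)A³.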

An important practical example is the 3×3 case, where we have seen we can identify every skew-symmetric matrix with a vector ω = θ u, where u = (x,y,z) is a unit magnitude vector. Recall that u is in the null space of the matrix associated with ω; so that, if we use a basis with u as the z axis, the final column and row will be zero. Thus, we know in advance that the exponential matrix must leave u fixed. It is mathematically impossible to supply a straightforward formula for such a basis as a function of u (its existence would violate the hairy ball theorem); but direct exponentiation is possible, and yields

\begin{align}
 \exp( \tilde{\boldsymbol{\omega}} )
 &{}= \exp \left( \begin{bmatrix} 0 & -z \theta & y \theta \\ z \theta & 0&-x \theta \\ -y \theta & x \theta & 0 \end{bmatrix} \right)= \boldsymbol{I} + 2cs~\boldsymbol{u\cdot A} + 2s^2 ~(\boldsymbol{u\cdot A} )^2 \\
 &{}= \begin{bmatrix}
    2 (x^2 - 1) s^2 + 1 & 2 x y s^2 - 2 z c s & 2 x z s^2 + 2 y c s \\
    2 x y s^2 + 2 z c s & 2 (y^2 - 1) s^2 + 1 & 2 y z s^2 - 2 x c s \\
    2 x z s^2 - 2 y c s & 2 y z s^2 + 2 x c s & 2 (z^2 - 1) s^2 + 1
  \end{bmatrix} ,
\end{align}

where c = cos(θ/2), s = sin(θ/2).

We recognize this as our matrix for a rotation around axis u by the angle θ: cf. Rodrigues' rotation formula.
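
As a numerical sanity check, the closed form above can be compared against a direct summation of the exponential series. This is only a sketch: the function names, the number of series terms, and the test axis and angle are choices made for the example.

```python
import math

def matmul(A, B):
    """Matrix-matrix product on lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm_series(A, terms=40):
    """Sum the series I + A + A^2/2! + ... with a fixed number of terms."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]  # A^k / k!
        total = [[t + d for t, d in zip(tr, dr)] for tr, dr in zip(total, term)]
    return total

def exp_half_angle(u, theta):
    """Closed form with c = cos(theta/2), s = sin(theta/2); u must be a unit vector."""
    x, y, z = u
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [2 * (x * x - 1) * s * s + 1, 2 * x * y * s * s - 2 * z * c * s, 2 * x * z * s * s + 2 * y * c * s],
        [2 * x * y * s * s + 2 * z * c * s, 2 * (y * y - 1) * s * s + 1, 2 * y * z * s * s - 2 * x * c * s],
        [2 * x * z * s * s - 2 * y * c * s, 2 * y * z * s * s + 2 * x * c * s, 2 * (z * z - 1) * s * s + 1],
    ]

def max_abs_diff(A, B):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

u = (2 / 3, -1 / 3, 2 / 3)   # unit axis: 4/9 + 1/9 + 4/9 = 1
theta = 1.1
x, y, z = (theta * comp for comp in u)
omega_tilde = [[0, -z, y], [z, 0, -x], [-y, x, 0]]
R_series = expm_series(omega_tilde)
R_closed = exp_half_angle(u, theta)
print(max_abs_diff(R_series, R_closed))  # expected to be at rounding level
```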

In any dimension, if we choose some nonzero A and consider all its scalar multiples, exponentiation yields rotation matrices along a geodesic of the group manifold, forming a one-parameter subgroup of the Lie group.

More broadly, the exponential map provides a homeomorphism between a neighborhood of the origin in the Lie algebra and a neighborhood of the identity in the Lie group. In fact, we can produce any rotation matrix as the exponential of some skew-symmetric matrix, so for these groups the exponential map is a surjection.

Baker–Campbell–Hausdorff formula

Suppose we are given A and B in the Lie algebra. Their exponentials, exp(A) and exp(B), are rotation matrices, which we can multiply. Since the exponential map is a surjection, we know that, for some C in the Lie algebra, exp(A)exp(B) = exp(C), and so we may write

 A \ast B = C~ .

When A and B commute, then C = A + B, mimicking the behavior of complex exponentiation. However, the general case is given by the more elaborate BCH formula, a series expansion of nested brackets.[7] For matrices, the bracket is the same operation as the commutator, which monitors lack of commutativity in multiplication. This general expansion unfolds as follows,

 A \ast B = A + B + \tfrac12 [A,B] + \tfrac{1}{12} [A,[A,B]] - \tfrac{1}{12} [B,[A,B]] + \cdots ~.
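
A rough numerical illustration of the expansion, for two small elements of so(3): the sketch below compares exp(A)exp(B) with the exponential of the naive sum A + B and with the exponential of the series truncated after the terms shown. The helper names and the test vectors are assumptions of this example.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(*Ms):
    """Entrywise sum of any number of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def scale(c, M):
    return [[c * v for v in row] for row in M]

def bracket(A, B):
    """Commutator AB - BA."""
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def hat(v):
    """Skew-symmetric matrix identified with the vector v."""
    x, y, z = v
    return [[0, -z, y], [z, 0, -x], [-y, x, 0]]

def expm_series(A, terms=40):
    """Matrix exponential by direct summation of the power series."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = add(total, term)
    return total

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = hat((0.10, -0.05, 0.08))
B = hat((0.03, 0.12, -0.07))
AB = bracket(A, B)
C_first = add(A, B)                                    # ignores all brackets
C_third = add(A, B, scale(1 / 2, AB),
              scale(1 / 12, bracket(A, AB)),
              scale(-1 / 12, bracket(B, AB)))          # terms shown above
target = matmul(expm_series(A), expm_series(B))
err_first = max_abs_diff(expm_series(C_first), target)
err_third = max_abs_diff(expm_series(C_third), target)
print(err_first, err_third)   # the truncated BCH series is far closer
```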

Representation of a rotation matrix as a sequential angle decomposition, as in Euler angles, may tempt one to treat rotations as a vector space, but the higher order terms in the BCH formula deprecate such an approach for large angles.

We again take special interest in the 3×3 case, where [A,B] equals the cross product, A×B. If A and B are linearly independent, then A, B, and A×B provide a complete basis; if not, then A and B commute. Evidently, in this dimension, the infinite expansion in the BCH formula for group composition has a compact form, as C = αA+βB+γA×B for suitable coefficients.[8]

(Also see the straightforward 2×2 derivation for SU(2). For the general n×n case, see [9].)

Spin group

The Lie group of n×n rotation matrices, SO(n), is a compact and path-connected manifold, and thus locally compact and connected. However, it is not simply connected, so Lie theory tells us it is a kind of "shadow" (a homomorphic image) of a universal covering group. Often the covering group, which in this case is the spin group denoted by Spin(n), is simpler and more natural to work with.[10]

In the case of planar rotations, SO(2) is topologically a circle, S¹. Its universal covering group, Spin(2), is isomorphic to the real line, R, under addition. In other words, whenever we use angles of arbitrary magnitude, which we often do, we are essentially taking advantage of the convenience of the "mother space". Every 2×2 rotation matrix is produced by a countable infinity of angles, separated by integer multiples of 2π. Correspondingly, the fundamental group of SO(2) is isomorphic to the integers, Z.

In the case of spatial rotations, SO(3) is topologically equivalent to three-dimensional real projective space, RP³. Its universal covering group, Spin(3), is isomorphic to the 3-sphere, S³. Every 3×3 rotation matrix is produced by two opposite points on the sphere. Correspondingly, the fundamental group of SO(3) is isomorphic to the two-element group, Z₂. We can also describe Spin(3) as isomorphic to quaternions of unit norm under multiplication, or to certain 4×4 real matrices, or to 2×2 complex special unitary matrices.

Concretely, a unit quaternion, q, with

\begin{align}
 q &{}= w + \bold{i}x + \bold{j}y + \bold{k}z , \\
 1 &{}= w^2 + x^2 + y^2 + z^2 ,
\end{align}

produces the rotation matrix

 Q = \begin{bmatrix}
    1 - 2 y^2 - 2 z^2 & 2 x y - 2 z w & 2 x z + 2 y w \\
    2 x y + 2 z w & 1 - 2 x^2 - 2 z^2 & 2 y z - 2 x w \\
    2 x z - 2 y w & 2 y z + 2 x w & 1 - 2 x^2 - 2 y^2
\end{bmatrix} .

This is our third version of this matrix, here as a rotation around the now non-unit axis vector (x,y,z) by angle 2θ, where cos θ = w and |sin θ| = ||(x,y,z)||. (The proper sign for sin θ is implied once the signs of the axis components are fixed.)

Many features of this case are the same for higher dimensions. The coverings are all two-to-one, with SO(n), n > 2, having fundamental group Z₂. The natural setting for these groups is within a Clifford algebra. And the action of the rotations is produced by a kind of "sandwich", denoted by qvq*.
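
For the 3×3 case the sandwich can be written out with ordinary quaternion arithmetic. The sketch below assumes quaternions stored as (w, x, y, z) tuples and embeds a vector v as the pure quaternion (0, v); the function names are illustrative.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)

def qconj(q):
    """Quaternion conjugate; for a unit quaternion this is also the inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via the sandwich q v q*."""
    _, x, y, z = qmul(qmul(q, (0.0, *v)), qconj(q))
    return (x, y, z)

half = math.radians(90) / 2                      # half of a 90 degree turn
q = (math.cos(half), 0.0, 0.0, math.sin(half))   # rotation about the z axis
rotated = rotate(q, (1.0, 0.0, 0.0))             # the x axis is carried to the y axis
minus_q = tuple(-c for c in q)
rotated_again = rotate(minus_q, (1.0, 0.0, 0.0)) # q and -q give the same rotation
```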

Infinitesimal rotations

The matrices in the Lie algebra are not themselves rotations; the skew-symmetric matrices are derivatives, proportional differences of rotations. An actual "differential rotation", or infinitesimal rotation matrix has the form

 I + A \, d\theta ~,

where dθ is vanishingly small.

These matrices do not satisfy all the same properties as ordinary finite rotation matrices under the usual treatment of infinitesimals.[11] To understand what this means, consider

 dA_{\bold{x}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -d\theta \\ 0 & d\theta & 1 \end{bmatrix}~ .

First, test the orthogonality condition, QᵀQ = I. The product is

 dA_{\bold{x}}^T \, dA_{\bold{x}} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1+d\theta^2 & 0 \\ 0 & 0 & 1+d\theta^2 \end{bmatrix} ,

differing from an identity matrix by second order infinitesimals, discarded here. So, to first order, an infinitesimal rotation matrix is an orthogonal matrix.

Next, examine the square of the matrix,

 dA_{\bold{x}}^2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-d\theta^2 & -2d\theta \\ 0 & 2d\theta & 1-d\theta^2 \end{bmatrix}~.

Again discarding second order effects, note that the angle simply doubles. This hints at the most essential difference in behavior, which we can exhibit with the assistance of a second infinitesimal rotation,

 dA_{\bold{y}} = \begin{bmatrix} 1 & 0 & d\phi \\ 0 & 1 & 0 \\ -d\phi & 0 & 1 \end{bmatrix} .

Compare the products dA_x dA_y to dA_y dA_x,

\begin{align}
 dA_{\bold{x}}\,dA_{\bold{y}} &{}= \begin{bmatrix} 1 & 0 & d\phi \\ d\theta\,d\phi & 1 & -d\theta \\ -d\phi & d\theta & 1 \end{bmatrix} \\
 dA_{\bold{y}}\,dA_{\bold{x}} &{}= \begin{bmatrix} 1 & d\theta\,d\phi & d\phi \\ 0 & 1 & -d\theta \\ -d\phi & d\theta & 1 \end{bmatrix}. \\
\end{align}

Since dθ dφ is second order, we discard it: thus, to first order, multiplication of infinitesimal rotation matrices is commutative. In fact,

 dA_{\bold{x}}\,dA_{\bold{y}} = dA_{\bold{y}}\,dA_{\bold{x}} , \,\!

again to first order. In other words, the order in which infinitesimal rotations are applied is irrelevant.

This useful fact makes, for example, derivation of rigid body rotation relatively simple. But we must always be careful to distinguish (the first order treatment of) these infinitesimal rotation matrices from both finite rotation matrices and from derivatives of rotation matrices (namely skew-symmetric matrices). Contrast the behavior of finite rotation matrices in the BCH formula above with that of infinitesimal rotation matrices, where all the commutator terms will be second order infinitesimals so we do have a bona fide vector space. (Technically, this dismissal of any second order terms amounts to group contraction.)
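
The second-order nature of the commutator can be observed numerically by using small finite angles in place of true infinitesimals. In this sketch (function names and the test angle are assumptions) the largest entry of Rx(ε)Ry(ε) − Ry(ε)Rx(ε) behaves like ε², so it quarters whenever ε is halved.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rot_x(t):
    """Finite rotation about the x axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    """Finite rotation about the y axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def commutator_size(eps):
    """Largest entry of Rx(eps) Ry(eps) - Ry(eps) Rx(eps)."""
    XY = matmul(rot_x(eps), rot_y(eps))
    YX = matmul(rot_y(eps), rot_x(eps))
    return max(abs(a - b) for ra, rb in zip(XY, YX) for a, b in zip(ra, rb))

eps = 1e-3
ratio_to_eps_squared = commutator_size(eps) / eps ** 2            # close to 1
ratio_on_doubling = commutator_size(2 * eps) / commutator_size(eps)  # close to 4
```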

Conversions

We have seen the existence of several decompositions that apply in any dimension, namely independent planes, sequential angles, and nested dimensions. In all these cases we can either decompose a matrix or construct one. We have also given special attention to 3×3 rotation matrices, and these warrant further attention, in both directions (Stuelpnagel 1964).

Quaternion

Given the unit quaternion q = (w,x,y,z), the equivalent 3×3 rotation matrix, acting on column vectors by pre-multiplication as elsewhere in this article, is

 Q = \begin{bmatrix}
    1 - 2 y^2 - 2 z^2 & 2 x y - 2 z w & 2 x z + 2 y w \\
    2 x y + 2 z w & 1 - 2 x^2 - 2 z^2 & 2 y z - 2 x w \\
    2 x z - 2 y w & 2 y z + 2 x w & 1 - 2 x^2 - 2 y^2
\end{bmatrix} .

Now every quaternion component appears multiplied by two in a term of degree two, and if all such terms are zero what's left is an identity matrix. This leads to an efficient, robust conversion from any quaternion – whether unit or non-unit – to a 3×3 rotation matrix.

n = w * w + x * x + y * y + z * z
s = if n == 0 then 0 else 2 / n
wx = s * w * x, wy = s * w * y, wz = s * w * z
xx = s * x * x, xy = s * x * y, xz = s * x * z
yy = s * y * y, yz = s * y * z, zz = s * z * z

[ 1 - (yy + zz)         xy - wz          xz + wy  ]
[      xy + wz     1 - (xx + zz)         yz - wx  ]
[      xz - wy          yz + wx     1 - (xx + yy) ]
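
The same conversion written as a Python function, as a direct transcription of the steps above; the function name and the list-of-rows return type are choices made for this sketch.

```python
def quaternion_to_matrix(w, x, y, z):
    """Convert any quaternion (unit or not) to a 3x3 rotation matrix.

    The zero quaternion is mapped to the identity matrix.
    """
    n = w * w + x * x + y * y + z * z
    s = 0.0 if n == 0 else 2.0 / n      # folds the normalisation into one factor
    wx, wy, wz = s * w * x, s * w * y, s * w * z
    xx, xy, xz = s * x * x, s * x * y, s * x * z
    yy, yz, zz = s * y * y, s * y * z, s * z * z
    return [
        [1 - (yy + zz), xy - wz, xz + wy],
        [xy + wz, 1 - (xx + zz), yz - wx],
        [xz - wy, yz + wx, 1 - (xx + yy)],
    ]

# A non-unit quaternion for a quarter turn about the z axis.
quarter_turn_z = quaternion_to_matrix(1, 0, 0, 1)
```

Because the scale factor s absorbs the norm, (1, 0, 0, 1) and its normalised version give the same matrix.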

Freed from the demand for a unit quaternion, we find that nonzero quaternions act as homogeneous coordinates for 3×3 rotation matrices. The Cayley transform, discussed earlier, is obtained by scaling the quaternion so that its w component is 1. For a 180° rotation around any axis, w will be zero, which explains the Cayley limitation.

The sum of the entries along the main diagonal (the trace), plus one, equals 4 − 4(x² + y² + z²), which is 4w². Thus we can write the trace itself as 2w² + 2w² − 1; and from the previous version of the matrix we see that the diagonal entries themselves have the same form: 2x² + 2w² − 1, 2y² + 2w² − 1, and 2z² + 2w² − 1. So we can easily compare the magnitudes of all four quaternion components using the matrix diagonal. We can, in fact, obtain all four magnitudes using sums and square roots, and choose consistent signs using the skew-symmetric part of the off-diagonal entries.

t = Qxx+Qyy+Qzz (trace of Q)
r = sqrt(1+t)
w = 0.5*r
x = copysign(0.5*sqrt(1+Qxx-Qyy-Qzz), Qzy-Qyz)
y = copysign(0.5*sqrt(1-Qxx+Qyy-Qzz), Qxz-Qzx)
z = copysign(0.5*sqrt(1-Qxx-Qyy+Qzz), Qyx-Qxy)

where copysign(x,y) is x with the sign of y:

\operatorname{copysign}(x,y) = \operatorname{sign}(y) \; |x|.

Alternatively, use a single square root and division

t = Qxx+Qyy+Qzz
r = sqrt(1+t)
s = 0.5/r
w = 0.5*r
x = (Qzy-Qyz)*s
y = (Qxz-Qzx)*s
z = (Qyx-Qxy)*s

This is numerically stable so long as the trace, t, is not negative; otherwise, we risk dividing by (nearly) zero. In that case, suppose Qxx is the largest diagonal entry, so x will have the largest magnitude (the other cases are similar); then the following is safe.

t = Qxx+Qyy+Qzz
r = sqrt(1+Qxx-Qyy-Qzz)
s = 0.5/r
w = (Qzy-Qyz)*s
x = 0.5*r
y = (Qxy+Qyx)*s
z = (Qzx+Qxz)*s
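
Putting the cases together gives a branch on the trace and then on the largest diagonal entry. The following Python sketch spells out the "other cases are similar" branches by symmetry; the function name and the exact branch order are assumptions of this example, and the input is assumed to be a genuine rotation matrix.

```python
import math

def matrix_to_quaternion(Q):
    """Return (w, x, y, z) for a 3x3 rotation matrix given as a list of rows."""
    (xx, xy, xz), (yx, yy, yz), (zx, zy, zz) = Q
    t = xx + yy + zz
    if t >= 0:                                   # w is safely away from zero
        r = math.sqrt(1 + t)
        s = 0.5 / r
        return (0.5 * r, (zy - yz) * s, (xz - zx) * s, (yx - xy) * s)
    if xx >= yy and xx >= zz:                    # x has the largest magnitude
        r = math.sqrt(1 + xx - yy - zz)
        s = 0.5 / r
        return ((zy - yz) * s, 0.5 * r, (xy + yx) * s, (zx + xz) * s)
    if yy >= zz:                                 # y has the largest magnitude
        r = math.sqrt(1 - xx + yy - zz)
        s = 0.5 / r
        return ((xz - zx) * s, (xy + yx) * s, 0.5 * r, (yz + zy) * s)
    r = math.sqrt(1 - xx - yy + zz)              # z has the largest magnitude
    s = 0.5 / r
    return ((yx - xy) * s, (zx + xz) * s, (yz + zy) * s, 0.5 * r)

# Half turns about each axis have w = 0 and exercise the three fallback branches.
half_turn_x = matrix_to_quaternion([[1, 0, 0], [0, -1, 0], [0, 0, -1]])
half_turn_y = matrix_to_quaternion([[-1, 0, 0], [0, 1, 0], [0, 0, -1]])
half_turn_z = matrix_to_quaternion([[-1, 0, 0], [0, -1, 0], [0, 0, 1]])
```

Since q and −q describe the same rotation, the overall sign of the result is a convention; here the component computed from the square root is taken non-negative.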

If the matrix contains significant error, such as accumulated numerical error, we may construct a symmetric 4×4 matrix,

 K = \frac13
 \begin{bmatrix}
  Q_{xx}-Q_{yy}-Q_{zz} & Q_{yx}+Q_{xy} & Q_{zx}+Q_{xz} & Q_{zy}-Q_{yz} \\
  Q_{yx}+Q_{xy} & Q_{yy}-Q_{xx}-Q_{zz} & Q_{zy}+Q_{yz} & Q_{xz}-Q_{zx} \\
  Q_{zx}+Q_{xz} & Q_{zy}+Q_{yz} & Q_{zz}-Q_{xx}-Q_{yy} & Q_{yx}-Q_{xy} \\
  Q_{zy}-Q_{yz} & Q_{xz}-Q_{zx} & Q_{yx}-Q_{xy} & Q_{xx}+Q_{yy}+Q_{zz}
 \end{bmatrix} ,

and find the eigenvector, (x,y,z,w), of its largest magnitude eigenvalue. (If Q is truly a rotation matrix, that value will be 1.) The quaternion so obtained will correspond to the rotation matrix closest to the given matrix (Bar-Itzhack 2000).
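
A small sketch of this eigenvector approach follows. Two assumptions are made here and should be read as such: the antisymmetric entries in the last row and column are arranged (Qzy − Qyz, Qxz − Qzx, Qyx − Qxy) so that the eigenvector is (x,y,z,w) for the quaternion-to-matrix formula of this section, and the eigenvector is found with plain power iteration from an arbitrary starting vector rather than with a library eigensolver.

```python
import math

def k_matrix(Q):
    """Symmetric 4x4 matrix K built from a 3x3 matrix Q (list of rows)."""
    (xx, xy, xz), (yx, yy, yz), (zx, zy, zz) = Q
    rows = [
        [xx - yy - zz, yx + xy, zx + xz, zy - yz],
        [yx + xy, yy - xx - zz, zy + yz, xz - zx],
        [zx + xz, zy + yz, zz - xx - yy, yx - xy],
        [zy - yz, xz - zx, yx - xy, xx + yy + zz],
    ]
    return [[v / 3.0 for v in row] for row in rows]

def dominant_eigenvector(K, iterations=200):
    """Power iteration; assumes the start vector is not orthogonal to the answer."""
    v = [1.0, 0.5, 0.25, 0.125]          # arbitrary starting vector
    for _ in range(iterations):
        v = [sum(k * c for k, c in zip(row, v)) for row in K]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return v

def nearest_quaternion(Q):
    """Unit quaternion (w, x, y, z), defined up to sign, for a near-rotation Q."""
    x, y, z, w = dominant_eigenvector(k_matrix(Q))
    return (w, x, y, z)

exact = nearest_quaternion([[0, -1, 0], [1, 0, 0], [0, 0, 1]])   # quarter turn about z
noisy = nearest_quaternion([[0.01, -1.0, 0.0], [0.99, 0.02, 0.0], [0.0, 0.01, 1.0]])
```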

Polar decomposition

If the n×n matrix M is non-singular, its columns are linearly independent vectors; thus the Gram–Schmidt process can adjust them to be an orthonormal basis. Stated in terms of numerical linear algebra, we convert M to an orthogonal matrix, Q, using QR decomposition. However, we often prefer a Q "closest" to M, which this method does not accomplish. For that, the tool we want is the polar decomposition (Fan & Hoffman 1955; Higham 1989).

To measure closeness, we may use any matrix norm invariant under orthogonal transformations. A convenient choice is the Frobenius norm, ‖Q − M‖_F, squared, which is the sum of the squares of the element differences. Writing this in terms of the trace, Tr, our goal is,

  • Find Q minimizing Tr( (Q − M)ᵀ(Q − M) ), subject to QᵀQ = I.

Though written in matrix terms, the objective function is just a quadratic polynomial. We can minimize it in the usual way, by finding where its derivative is zero. For a 3×3 matrix, the orthogonality constraint implies six scalar equalities that the entries of Q must satisfy. To incorporate the constraint(s), we may employ a standard technique, Lagrange multipliers, assembled as a symmetric matrix, Y. Thus our method is:

  • Differentiate Tr( (Q − M)ᵀ(Q − M) + (QᵀQ − I)Y ) with respect to (the entries of) Q, and equate to zero.

Consider a 2×2 example. Including constraints, we seek to minimize

\begin{align}
 &\scriptstyle{ (Q_{xx}-M_{xx})^2 + (Q_{xy}-M_{xy})^2 } \\
 &\scriptstyle{ {} + (Q_{yx}-M_{yx})^2 + (Q_{yy}-M_{yy})^2 } \\
 &\scriptstyle{ {} + (Q_{xx}^2+Q_{yx}^2-1)Y_{xx} + (Q_{xy}^2+Q_{yy}^2-1)Y_{yy} } \\
 &\scriptstyle{ {} + 2(Q_{xx} Q_{xy} + Q_{yx} Q_{yy})Y_{xy} . }
\end{align}

Taking the derivative with respect to Qxx, Qxy, Qyx, Qyy in turn, we assemble a matrix.

\scriptstyle{ 2
\begin{bmatrix}
\scriptstyle{ Q_{xx}-M_{xx} + Q_{xx} Y_{xx} + Q_{xy} Y_{xy} } & \scriptstyle{ Q_{xy}-M_{xy} + Q_{xx} Y_{xy} + Q_{xy} Y_{yy} } \\
\scriptstyle{ Q_{yx}-M_{yx} + Q_{yx} Y_{xx} + Q_{yy} Y_{xy} } & \scriptstyle{ Q_{yy}-M_{yy} + Q_{yx} Y_{xy} + Q_{yy} Y_{yy} }
\end{bmatrix}}

In general, we obtain the equation

 0 = 2(Q-M) + 2QY , \,\!

so that

 M = Q(I+Y) = QS , \,\!

where Q is orthogonal and S is symmetric. To ensure a minimum, the Y matrix (and hence S) must be positive definite. Linear algebra calls QS the polar decomposition of M, with S the positive square root of S² = MᵀM.

 S^2 = (Q^T M)^T (Q^T M) = M^T Q Q^T M = M^T M \,\!

When M is non-singular, the Q and S factors of the polar decomposition are uniquely determined. However, the determinant of S is positive because S is positive definite, so Q inherits the sign of the determinant of M. That is, Q is only guaranteed to be orthogonal, not a rotation matrix. This is unavoidable; an M with negative determinant has no uniquely defined closest rotation matrix.
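
In practice the orthogonal factor is usually computed iteratively rather than from the Lagrange conditions. The sketch below uses a different technique from the derivation above, namely the Newton iteration Q ← ½(Q + Q⁻ᵀ) started from M, for a non-singular 3×3 matrix; the function names, the fixed iteration count and the test matrix are assumptions of this example.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def inverse3(M):
    """Inverse of a non-singular 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def polar_orthogonal_factor(M, iterations=30):
    """Orthogonal factor Q of M = QS by Newton iteration (M must be non-singular)."""
    Q = [[float(v) for v in row] for row in M]
    for _ in range(iterations):
        inv_t = transpose(inverse3(Q))
        Q = [[0.5 * (p + q) for p, q in zip(r1, r2)] for r1, r2 in zip(Q, inv_t)]
    return Q

# Test matrix: a 30 degree rotation about z times a positive diagonal stretch,
# so the orthogonal factor should be the rotation itself.
c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
stretch = (2.0, 1.0, 0.5)
M = [[R[i][j] * stretch[j] for j in range(3)] for i in range(3)]
Q = polar_orthogonal_factor(M)
```

As noted above, the result is only guaranteed to be orthogonal; its determinant has the same sign as that of M.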

Axis and angle

To efficiently construct a rotation matrix Q from an angle θ and a unit axis u, we can take advantage of symmetry and skew-symmetry within the entries. If x, y, and z are the components of the unit vector representing the axis, and

\begin{align}
c &= \cos \theta\\
s &= \sin \theta\\
C &= 1-c
\end{align}

then

Q(\theta) = \begin{bmatrix}
xxC+c  & xyC-zs & xzC+ys\\
yxC+zs & yyC+c  & yzC-xs\\
zxC-ys & zyC+xs & zzC+c
\end{bmatrix}
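
A direct transcription of this matrix into Python might look as follows; the function name is illustrative and the axis is assumed to be already normalised.

```python
import math

def axis_angle_to_matrix(u, theta):
    """Rotation matrix for angle theta (radians) about the unit axis u = (x, y, z)."""
    x, y, z = u
    c, s = math.cos(theta), math.sin(theta)
    C = 1 - c
    return [
        [x * x * C + c, x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, y * y * C + c, y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, z * z * C + c],
    ]

# A third of a turn about the diagonal (1,1,1) cycles the axes x -> y -> z -> x.
d = 1 / math.sqrt(3)
cycle = axis_angle_to_matrix((d, d, d), 2 * math.pi / 3)
quarter_turn_z = axis_angle_to_matrix((0.0, 0.0, 1.0), math.pi / 2)
```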

Determining an axis and angle, like determining a quaternion, is only possible up to sign; that is, (u,θ) and (−u,−θ) correspond to the same rotation matrix, just like q and −q. As well, axis-angle extraction presents additional difficulties. The angle can be restricted to be from 0° to 180°, but angles are formally ambiguous by multiples of 360°. When the angle is zero, the axis is undefined. When the angle is 180°, the matrix becomes symmetric, which has implications in extracting the axis. Near multiples of 180°, care is needed to avoid numerical problems: in extracting the angle, a two-argument arctangent with atan2(sin θ,cos θ) equal to θ avoids the insensitivity of arccosine; and in computing the axis magnitude in order to force unit magnitude, a brute-force approach can lose accuracy through underflow (Moler & Morrison 1983).

A partial approach is as follows:

\begin{align}
 x &= Q_{zy} - Q_{yz}\\
 y &= Q_{xz} - Q_{zx}\\
 z &= Q_{yx} - Q_{xy}\\
 r &= \sqrt{x^2 + y^2 + z^2}\\
 t &= Q_{xx} + Q_{yy} + Q_{zz}\\
 \theta &= \mbox{atan2}(r,t-1)
\end{align}

The x, y, and z components of the axis would then be divided by r. A fully robust approach will use different code when t, the trace of the matrix Q, is negative, as with quaternion extraction. When r is zero because the angle is zero, an axis must be provided from some source other than the matrix.
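
The partial approach can be sketched in Python as below. It inherits the limitations just described: the function (a hypothetical name) returns no axis when r is zero, which happens both for a zero angle and for a rotation by 180°, and it makes no attempt at the more careful scaling mentioned above.

```python
import math

def matrix_to_axis_angle(Q):
    """Return (axis, theta) from a 3x3 rotation matrix; axis is None when r == 0."""
    x = Q[2][1] - Q[1][2]
    y = Q[0][2] - Q[2][0]
    z = Q[1][0] - Q[0][1]
    r = math.sqrt(x * x + y * y + z * z)     # equals 2 sin(theta)
    t = Q[0][0] + Q[1][1] + Q[2][2]          # equals 1 + 2 cos(theta)
    theta = math.atan2(r, t - 1)
    if r == 0:
        return None, theta                   # axis must come from elsewhere
    return (x / r, y / r, z / r), theta

# The cyclic permutation of the axes is a third of a turn about (1,1,1).
axis, angle = matrix_to_axis_angle([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
no_axis, zero_angle = matrix_to_axis_angle([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```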

Euler angles

Complexity of conversion escalates with Euler angles (used here in the broad sense). The first difficulty is to establish which of the twenty-four variations of Cartesian axis order we will use. Suppose the three angles are θ1, θ2, θ3; physics and chemistry may interpret these as

 Q(\theta_1,\theta_2,\theta_3)=  Q_{\bold{z}}(\theta_1) Q_{\bold{y}}(\theta_2) Q_{\bold{z}}(\theta_3) , \,\!

while aircraft dynamics may use

 Q(\theta_1,\theta_2,\theta_3)=  Q_{\bold{z}}(\theta_3) Q_{\bold{y}}(\theta_2) Q_{\bold{x}}(\theta_1) . \,\!

One systematic approach begins with choosing the right-most axis. Among all permutations of (x,y,z), only two place that axis first; one is an even permutation and the other odd. Choosing parity thus establishes the middle axis. That leaves two choices for the left-most axis, either duplicating the first or not. These three choices give us 3×2×2 = 12 variations; we double that to 24 by choosing static or rotating axes.

This is enough to construct a matrix from angles, but triples differing in many ways can give the same rotation matrix. For example, suppose we use the zyz convention above; then we have the following equivalent pairs:

(90°, 45°, −105°)   ≡   (−270°, −315°, 255°)    multiples of 360°
(72°, 0°, 0°)       ≡   (40°, 0°, 32°)          singular alignment
(45°, 60°, −30°)    ≡   (−135°, −60°, 150°)     bistable flip
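
These equivalences can be confirmed numerically for the zyz convention, Q = Qz(θ1) Qy(θ2) Qz(θ3). The sketch below takes angles in degrees; the function names and the comparison tolerance are choices made for this example.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rot_z(deg):
    """Rotation about the z axis by an angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(deg):
    """Rotation about the y axis by an angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def euler_zyz(t1, t2, t3):
    """Q = Qz(t1) Qy(t2) Qz(t3), angles in degrees."""
    return matmul(matmul(rot_z(t1), rot_y(t2)), rot_z(t3))

def same_rotation(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

pairs = [
    ((90, 45, -105), (-270, -315, 255)),   # multiples of 360 degrees
    ((72, 0, 0), (40, 0, 32)),             # singular alignment
    ((45, 60, -30), (-135, -60, 150)),     # bistable flip
]
results = [same_rotation(euler_zyz(*p), euler_zyz(*q)) for p, q in pairs]
```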

Angles for any order can be found using a concise common routine (Herter & Lott 1993; Shoemake 1994).

The problem of singular alignment, the mathematical analog of physical gimbal lock, occurs when the middle rotation aligns the axes of the first and last rotations. It afflicts every axis order at either even or odd multiples of 90°. These singularities are not characteristic of the rotation matrix as such, and only occur with the usage of Euler angles.

The singularities are avoided when considering and manipulating the rotation matrix as orthonormal row vectors (in 3D applications often named 'right'-vector, 'up'-vector and 'out'-vector) instead of as angles. The singularities are also avoided when working with quaternions.

Uniform random rotation matrices

We sometimes need to generate a uniformly distributed random rotation matrix. It seems intuitively clear in two dimensions that this means the rotation angle is uniformly distributed between 0 and 2π. That intuition is correct, but does not carry over to higher dimensions. For example, if we decompose 3×3 rotation matrices in axis-angle form, the angle should not be uniformly distributed; the probability that (the magnitude of) the angle is at most θ should be (1/π)(θ − sin θ), for 0 ≤ θ ≤ π.

Since SO(n) is a connected and locally compact Lie group, we have a simple standard criterion for uniformity, namely that the distribution be unchanged when composed with any arbitrary rotation (a Lie group "translation"). This definition corresponds to what is called Haar measure. León, Massé & Rivest (2006) show how to use the Cayley transform to generate and test matrices according to this criterion.

We can also generate a uniform distribution in any dimension using the subgroup algorithm of Diaconis & Shahshahani (1987). This recursively exploits the nested dimensions group structure of SO(n), as follows. Generate a uniform angle and construct a 2×2 rotation matrix. To step from n to n+1, generate a vector v uniformly distributed on the n-sphere, Sⁿ, embed the n×n matrix in the next larger size with last column (0,…,0,1), and rotate the larger matrix so the last column becomes v.

As usual, we have special alternatives for the 3×3 case. Each of these methods begins with three independent random scalars uniformly distributed on the unit interval. Arvo (1992) takes advantage of the odd dimension to change a Householder reflection to a rotation by negation, and uses that to aim the axis of a uniform planar rotation.

Another method uses unit quaternions. Multiplication of rotation matrices is homomorphic to multiplication of quaternions, and multiplication by a unit quaternion rotates the unit sphere. Since the homomorphism is a local isometry, we immediately conclude that to produce a uniform distribution on SO(3) we may use a uniform distribution on S³.
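
One way to sketch the quaternion method in Python is shown below. Drawing the point on the 3-sphere by normalising four independent Gaussian samples is an assumption of this example (it is a common way to sample a sphere uniformly, not a method taken from the references above), as are the function names, the seed and the sample size. The sampled rotation angles can then be compared with the distribution (1/π)(θ − sin θ).

```python
import math
import random

def random_unit_quaternion(rng):
    """Normalise four independent Gaussian samples to get a point on the 3-sphere."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        n = math.sqrt(sum(c * c for c in q))
        if n > 1e-12:                      # guard against a degenerate draw
            return tuple(c / n for c in q)

def quaternion_to_matrix(q):
    """Rotation matrix of a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * y * y - 2 * z * z, 2 * x * y - 2 * z * w, 2 * x * z + 2 * y * w],
        [2 * x * y + 2 * z * w, 1 - 2 * x * x - 2 * z * z, 2 * y * z - 2 * x * w],
        [2 * x * z - 2 * y * w, 2 * y * z + 2 * x * w, 1 - 2 * x * x - 2 * y * y],
    ]

def rotation_angle(q):
    """Rotation angle in [0, pi] of the rotation produced by the unit quaternion q."""
    return 2 * math.acos(min(1.0, abs(q[0])))

rng = random.Random(0)                     # fixed seed so the run is repeatable
samples = [random_unit_quaternion(rng) for _ in range(20000)]
fraction = sum(rotation_angle(q) <= math.pi / 2 for q in samples) / len(samples)
expected = (math.pi / 2 - math.sin(math.pi / 2)) / math.pi   # (1/pi)(theta - sin theta)
```

With enough samples the observed fraction of rotations by at most 90° should settle near the predicted value, which is well below the one half that a uniformly distributed angle would give.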

Euler angles can also be used, though not with each angle uniformly distributed (Murnaghan 1962; Miles 1965).

For the axis-angle form, the axis is uniformly distributed over the unit sphere of directions, S², while the angle has the non-uniform distribution over [0,π] noted previously (Miles 1965).

Notes

  1. ^ Swokowski, Earl (1979). Calculus with Analytic Geometry (Prindle, Weber, and Schmidt). 
  2. ^ W3C recommendation (2003), Scalable Vector Graphics – the initial coordinate system 
  3. ^ Note that if instead of rotating vectors, it is the reference frame that is being rotated, the signs on the sinθ terms will be reversed. If reference frame A is rotated anti-clockwise about the origin through an angle θ to create reference frame B, then R_x (with the signs flipped) will transform a vector described in reference frame A coordinates to reference frame B coordinates.
  4. ^ Taylor, Camillo; Kriegman (1994). "Minimization on the Lie Group SO(3) and Related Manifolds". Technical Report. No. 9405 (Yale University). 
  5. ^ Baker (2003); Fulton & Harris (1991)
  6. ^ (Wedderburn 1934, §8.02)
  7. ^ Hall 2004, Ch. 3; Varadarajan 1984, §2.15
  8. ^ (Engø 2001)
  9. ^ Curtright, T L; Fairlie, D B; Zachos, C K (2014). "A compact formula for rotations as spin matrix polynomials". SIGMA 10: 084. doi:10.3842/SIGMA.2014.084. 
  10. ^ Baker 2003, Ch. 5; Fulton & Harris 1991, pp. 299–315
  11. ^ (Goldstein, Poole & Safko 2002, §4.8)

References

External links