Covariance and contravariance of vectors

In multilinear algebra and tensor analysis, covariance and contravariance describe how the quantitative description of certain geometrical or physical entities changes when passing from one coordinate system to another. The coordinates of a geometrical vector can be measured with respect to a given system of rigid rods (or a basis). For the vector itself to be invariant, or independent of the arbitrary choice of basis, its coordinates must vary oppositely – or contra-vary – to compensate for a change in the basis: that is, the coordinates of a vector are contravariant. While components of vectors transform contravariantly, components of dual vectors transform covariantly.

The distinction is particularly important for computations with tensors, which often have mixed variance. This means that they have both covariant and contravariant components, or both vector and dual vector components. The valence of a tensor is the number of contravariant and covariant terms, and in Einstein notation, covariant components have lower indices, while contravariant components have upper indices. The duality between covariance and contravariance intervenes whenever a vector or tensor quantity is represented by its components, although modern differential geometry uses more sophisticated index-free methods to represent tensors.

The terms covariant and contravariant were introduced by J.J. Sylvester in 1853 in order to study algebraic invariant theory. In this context, for instance, a system of simultaneous equations is contravariant in the variables. The use of both terms in the modern context of multilinear algebra is a specific example of corresponding notions in category theory.

Introduction

In physics, a vector typically arises as the outcome of a measurement or series of measurements, and is represented as a list (or tuple) of numbers such as

(v_{1},v_{2},v_{3}).

This list of numbers depends on the choice of coordinate system of the laboratory. For instance, if the vector represents position with respect to an observer, then the coordinate system may be obtained from a system of rigid rods, or reference axes, along which the components v₁, v₂, and v₃ are measured. For a vector to represent a geometrical object, it must be possible to describe how it looks in any other coordinate system. That is to say, the components of the vectors will transform in a certain way in passing from one coordinate system to another.

A contravariant vector is required to have components that "transform like the coordinates" under changes of coordinates such as rotation and dilation. The vector itself does not change under these operations; instead, the components of the vector change in a way that cancels the change in the spatial axes, in the same way that coordinates change. In other words, if the reference axes were rotated in one direction, the component representation of the vector would rotate in exactly the opposite way. Similarly, if the reference axes were stretched in one direction, the components of the vector, like the coordinates, would reduce in an exactly compensating way. Mathematically, if the coordinate system undergoes a transformation described by an invertible matrix M, so that a coordinate vector x is transformed to x′ = Mx, then a contravariant vector v must be similarly transformed via v′ = Mv.

This important requirement is what distinguishes a contravariant vector from any other triple of physically meaningful quantities. For example, if v consists of the x, y, and z-components of velocity, then v is a contravariant vector: when space is stretched, rotated, or twisted, the components of the velocity transform in the same way as the coordinates of space. On the other hand, a triple consisting of the length, width, and height of a rectangular box could make up the three components of an abstract vector, but this vector would not be contravariant, since rotating the box does not change the box's length, width, and height. Examples of contravariant vectors include displacement, velocity, electric field, momentum, force, and acceleration.

By contrast, a covariant vector has components that change oppositely to the coordinates or, equivalently, transform like the reference axes. For instance, the components of the gradient vector of a function

\nabla f={\frac {\partial f}{\partial x_{1}}}{\widehat {x_{1}}}+{\frac {\partial f}{\partial x_{2}}}{\widehat {x_{2}}}+{\frac {\partial f}{\partial x_{3}}}{\widehat {x_{3}}}

transform like the reference axes themselves.
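
As a one-dimensional check (an illustrative sketch with an assumed scale factor of 2, chosen here only for concreteness): if a reference axis is stretched so that the new unit of length is twice the old one, a position coordinate x becomes x′ = x/2, whereas the chain rule gives

{\frac {\partial f}{\partial x'}}={\frac {\partial x}{\partial x'}}{\frac {\partial f}{\partial x}}=2\,{\frac {\partial f}{\partial x}}.

The coordinate is halved, as for any contravariant component, while the derivative is doubled, in step with the stretched axis.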

Definition

The general formulation of covariance and contravariance refers to how the components of a coordinate vector transform under a change of basis. Thus let V be a vector space of dimension n over the field of scalars S, and let each of f = (X₁,...,X_n) and f' = (Y₁,...,Y_n) be a basis of V.^[1] Also, let the change of basis from f to f′ be given by

\mathbf {f} \mapsto \mathbf {f} '=\left(\sum _{i}a_{1}^{i}X_{i},\dots ,\sum _{i}a_{n}^{i}X_{i}\right)=\mathbf {f} A

(1)

for some invertible n×n matrix A with entries $a_{j}^{i}$. Here, each vector Y_j of the f' basis is a linear combination of the vectors X_i of the f basis, so that

Y_{j}=\sum _{i}a_{j}^{i}X_{i}.

Contravariant transformation

A vector v in V is expressed uniquely as a linear combination of the elements of the f basis as

v=\sum _{i}v^{i}[\mathbf {f} ]X_{i},

(2)

where vⁱ[f] are scalars in S known as the components of v in the f basis. Denote the column vector of components of v by v[f]:

\mathbf {v} [\mathbf {f} ]={\begin{bmatrix}v^{1}[\mathbf {f} ]\\v^{2}[\mathbf {f} ]\\\vdots \\v^{n}[\mathbf {f} ]\end{bmatrix}}

so that (2) can be rewritten as a matrix product

v=\mathbf {f} \,\mathbf {v} [\mathbf {f} ].

The vector v may also be expressed in terms of the f' basis, so that

v=\mathbf {f'} \,\mathbf {v} [\mathbf {f'} ].

However, since the vector v itself is invariant under the choice of basis,

\mathbf {f} \,\mathbf {v} [\mathbf {f} ]=v=\mathbf {f'} \,\mathbf {v} [\mathbf {f'} ].

The invariance of v combined with the relationship (1) between f and f' implies that

\mathbf {f} \,\mathbf {v} [\mathbf {f} ]=\mathbf {f} A\,\mathbf {v} [\mathbf {f} A],

giving the transformation rule

\mathbf {v} [\mathbf {f} A]=A^{-1}\mathbf {v} [\mathbf {f} ].

In terms of components,

v^{i}[\mathbf {f} A]=\sum _{j}{\tilde {a}}_{j}^{i}v^{j}[\mathbf {f} ]

where the coefficients ${\tilde {a}}_{j}^{i}$ are the entries of the inverse matrix of A.

Because the components of the vector v transform with the inverse of the matrix A, these components are said to transform contravariantly under a change of basis.
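
The rule v[fA] = A⁻¹ v[f] can be checked numerically. The following is a minimal sketch in plain Python; the basis, the matrix A, the components of v, and the helper names (`matvec`, `inverse_2x2`, `combine`) are all arbitrary choices made for illustration and do not come from the text above.

```python
from fractions import Fraction

def matvec(M, x):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def inverse_2x2(M):
    """Exact inverse of a 2x2 integer matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def combine(basis, comps):
    """Form sum_i comps[i] * basis[i]; each basis vector is a list of numbers."""
    n = len(basis[0])
    return [sum(c * e[k] for c, e in zip(comps, basis)) for k in range(n)]

# Old basis f = (X_1, X_2): the standard basis of R^2 (assumed for simplicity).
f = [[1, 0], [0, 1]]
# Change-of-basis matrix A; column j holds the f-components of Y_j, as in (1).
A = [[2, 1], [0, 1]]
# New basis f' = f A, i.e. Y_j = sum_i a^i_j X_i.
f_new = [combine(f, [A[i][j] for i in range(2)]) for j in range(2)]

v_f = [3, 4]                            # components v[f]
v_f_new = matvec(inverse_2x2(A), v_f)   # components v[fA] = A^{-1} v[f]

# The vector itself is unchanged: f v[f] equals f' v[f'].
assert combine(f, v_f) == combine(f_new, v_f_new)
```

The basis is multiplied by A while the components are multiplied by A⁻¹, so the two changes cancel in the product f v[f].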

Covariant transformation

A linear functional α on V is expressed uniquely in terms of its components (scalars in S) in the f basis as

\alpha _{i}[\mathbf {f} ]=\alpha (X_{i}),\quad i=1,2,\dots ,n.

These components are the action of α on the basis vectors X_i of the f basis.

Under the change of basis from f to f' (1), the components transform such that

{\begin{array}{rcl}\alpha _{i}[\mathbf {f} A]&=&\alpha (Y_{i})\\&=&\alpha \left(\sum _{j}a_{i}^{j}X_{j}\right)\\&=&\sum _{j}a_{i}^{j}\alpha (X_{j})\\&=&\sum _{j}a_{i}^{j}\alpha _{j}[\mathbf {f} ]\end{array}}.

(3)

Denote the row vector of components of α by α[f]:

\mathbf {\alpha } [\mathbf {f} ]={\begin{bmatrix}\alpha _{1}[\mathbf {f} ]&\alpha _{2}[\mathbf {f} ]&\dots &\alpha _{n}[\mathbf {f} ]\end{bmatrix}}

so that (3) can be rewritten as the matrix product

\alpha [\mathbf {f} A]=\alpha [\mathbf {f} ]A.

Because the components of the linear functional α transform with the matrix A, these components are said to transform covariantly under a change of basis.

Had a column vector representation been used instead, the transformation law would be the transpose

\alpha ^{T}[\mathbf {f} A]=A^{T}\alpha ^{T}[\mathbf {f} ].
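
A short numerical sketch of the covariant rule α[fA] = α[f]A follows. It assumes an arbitrary 2×2 matrix A and arbitrary components for α and v, and the helper names are hypothetical. It also checks that the scalar α(v), computed as the row vector of α times the column vector of v, does not depend on the basis.

```python
from fractions import Fraction

def matvec(M, x):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def row_times_matrix(alpha, M):
    """Multiply a row vector by a square matrix (list of rows)."""
    n = len(M)
    return [sum(alpha[i] * M[i][j] for i in range(n)) for j in range(n)]

def inverse_2x2(M):
    """Exact inverse of a 2x2 integer matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def pair(alpha, v):
    """Evaluate alpha(v) from components taken in the same basis."""
    return sum(a * c for a, c in zip(alpha, v))

A = [[2, 1], [0, 1]]     # change-of-basis matrix (illustrative choice)
alpha_f = [5, -2]        # row vector of components alpha[f]
v_f = [3, 4]             # column vector of components v[f]

alpha_new = row_times_matrix(alpha_f, A)   # covariant: alpha[fA] = alpha[f] A
v_new = matvec(inverse_2x2(A), v_f)        # contravariant: v[fA] = A^{-1} v[f]

# The pairing alpha(v) is the same number in both bases.
assert pair(alpha_f, v_f) == pair(alpha_new, v_new)
```

The factor A picked up by α is cancelled by the factor A⁻¹ picked up by v, which is why the value α(v) is basis-independent.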

Coordinates

The choice of basis f on the vector space V defines uniquely a set of coordinate functions on V, by means of

x^{i}[\mathbf {f} ](v)=v^{i}[\mathbf {f} ].

The coordinates on V are therefore contravariant in the sense that

x^{i}[\mathbf {f} A]=\sum _{k=1}^{n}{\tilde {a}}_{k}^{i}x^{k}[\mathbf {f} ].

Conversely, a system of n quantities vⁱ that transform like the coordinates xⁱ on V defines a contravariant vector. A system of n quantities that transform oppositely to the coordinates is then a covariant vector.

This formulation of contravariance and covariance is often more natural in applications, in which there is a coordinate space (a manifold) on which vectors live as tangent vectors or cotangent vectors. Given a local coordinate system xⁱ on the manifold, the reference axes for the coordinate system are the vector fields

X_{1}={\frac {\partial }{\partial x^{1}}},\dots ,X_{n}={\frac {\partial }{\partial x^{n}}}.

This gives rise to the frame f = (X₁,...,X_n) at every point of the coordinate patch.

If yⁱ is a different coordinate system and

Y_{1}={\frac {\partial }{\partial y^{1}}},\dots ,Y_{n}={\frac {\partial }{\partial y^{n}}},

then the frame f' is related to the frame f by the inverse of the Jacobian matrix of the coordinate transition:

\mathbf {f} '=\mathbf {f} J^{-1},\quad J=\left({\frac {\partial y^{i}}{\partial x^{j}}}\right)_{i,j=1}^{n}.

Or, in indices,

{\frac {\partial }{\partial y^{i}}}=\sum _{j=1}^{n}{\frac {\partial x^{j}}{\partial y^{i}}}{\frac {\partial }{\partial x^{j}}}.

A tangent vector is by definition a vector that is a linear combination of the coordinate partials $\partial /\partial x^{i}$. Thus a tangent vector is defined by

v=\sum _{i=1}^{n}v^{i}[\mathbf {f} ]X_{i}=\mathbf {f} \ \mathbf {v} [\mathbf {f} ].

Such a vector is contravariant with respect to change of frame. Under changes in the coordinate system, one has

\mathbf {v} [\mathbf {f} ']=\mathbf {v} [\mathbf {f} J^{-1}]=J\,\mathbf {v} [\mathbf {f} ].

Therefore the components of a tangent vector transform via

v^{i}[\mathbf {f} ']=\sum _{j=1}^{n}{\frac {\partial y^{i}}{\partial x^{j}}}v^{j}[\mathbf {f} ].

Accordingly, a system of n quantities vⁱ depending on the coordinates that transform in this way on passing from one coordinate system to another is called a contravariant vector.
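
As an illustration of this transformation law, the sketch below takes xⁱ to be Cartesian coordinates (x, y) on the plane and yⁱ to be polar coordinates (r, θ). The sample point and the velocity field are assumptions chosen for the example, and `jacobian_polar` is a hypothetical helper name.

```python
import math

def jacobian_polar(x, y):
    """Jacobian J = d(r, theta)/d(x, y) at a Cartesian point other than the origin."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],       # dr/dx,     dr/dy
            [-y / r2, x / r2]]    # dtheta/dx, dtheta/dy

def matvec(M, v):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# Sample point and the velocity of a point rotating about the origin
# with unit angular speed: (dx/dt, dy/dt) = (-y, x).
x, y = 3.0, 4.0
v_cartesian = [-y, x]

# Contravariant rule: v[f'] = J v[f].
v_polar = matvec(jacobian_polar(x, y), v_cartesian)

# Expected on geometric grounds: no radial motion, unit angular velocity.
assert math.isclose(v_polar[0], 0.0, abs_tol=1e-12)
assert math.isclose(v_polar[1], 1.0)
```

The polar components (dr/dt, dθ/dt) come out as approximately (0, 1), which matches the description of the motion as a pure rotation.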

Covariant and contravariant components of a vector

In a Euclidean space V, there is little distinction between covariant and contravariant vectors, because the dot product allows for covectors to be identified with vectors. That is, a vector v determines uniquely a covector α via

\alpha (w)=v\cdot w

for all vectors w. Conversely, each covector α determines a unique vector v by this equation. Because of this identification of vectors with covectors, one may speak of the covariant components or contravariant components of a vector.

Given a basis f = (X₁,...,X_n) of V, there is a unique reciprocal basis f^# = (Y¹,...,Yⁿ) of V determined by requiring

Y^{i}\cdot X_{j}=\delta _{j}^{i},

the Kronecker delta. In terms of these bases, any vector v can be written in two ways:

{\begin{aligned}v&=\sum _{i}v^{i}[\mathbf {f} ]X_{i}=\mathbf {f} \,\mathbf {v} [\mathbf {f} ]\\&=\sum _{i}v_{i}[\mathbf {f} ]Y^{i}=\mathbf {f} ^{\sharp }\mathbf {v} ^{\sharp }[\mathbf {f} ].\end{aligned}}

The components vⁱ[f] are the contravariant components of the vector v in the basis f, and the components v_i[f] are the covariant components of v in the basis f. The terminology is justified because under a change of basis,

\mathbf {v} [\mathbf {f} A]=A^{-1}\mathbf {v} [\mathbf {f} ],\quad \mathbf {v} ^{\sharp }[\mathbf {f} A]=A^{T}\mathbf {v} ^{\sharp }[\mathbf {f} ].

The contravariant components of a vector are obtained by projecting onto the coordinate axes. The covariant components are obtained by projecting onto the normal lines to the coordinate hyperplanes.

Euclidean R³

In the Euclidean space R³, the dot product allows for covectors to be identified with vectors.

This fact allows one to determine explicitly the dual basis to a given set of basis vectors e₁, e₂, e₃ of R³ that are not necessarily assumed to be orthogonal nor of unit norm. The contravariant (dual) basis vectors are:

\mathbf {e} ^{1}={\frac {\mathbf {e} _{2}\times \mathbf {e} _{3}}{\mathbf {e} _{1}\cdot (\mathbf {e} _{2}\times \mathbf {e} _{3})}};\qquad \mathbf {e} ^{2}={\frac {\mathbf {e} _{3}\times \mathbf {e} _{1}}{\mathbf {e} _{2}\cdot (\mathbf {e} _{3}\times \mathbf {e} _{1})}};\qquad \mathbf {e} ^{3}={\frac {\mathbf {e} _{1}\times \mathbf {e} _{2}}{\mathbf {e} _{3}\cdot (\mathbf {e} _{1}\times \mathbf {e} _{2})}}.

Even if the e_i and eⁱ are not orthonormal, they are still by this definition mutually dual:

\mathbf {e} ^{i}\cdot \mathbf {e} _{j}=\delta _{j}^{i}.

Then the contravariant coordinates of any vector v can be obtained by the dot product of v with the contravariant basis vectors:

q^{1}=\mathbf {v} \cdot \mathbf {e} ^{1};\qquad q^{2}=\mathbf {v} \cdot \mathbf {e} ^{2};\qquad q^{3}=\mathbf {v} \cdot \mathbf {e} ^{3}.

Likewise, the covariant components of v can be obtained from the dot product of v with covariant basis vectors, viz.

q_{1}=\mathbf {v} \cdot \mathbf {e} _{1};\qquad q_{2}=\mathbf {v} \cdot \mathbf {e} _{2};\qquad q_{3}=\mathbf {v} \cdot \mathbf {e} _{3}.

Then v can be expressed in two (reciprocal) ways, viz.

\mathbf {v} =q_{i}\mathbf {e} ^{i}=q_{1}\mathbf {e} ^{1}+q_{2}\mathbf {e} ^{2}+q_{3}\mathbf {e} ^{3}

or

\mathbf {v} =q^{i}\mathbf {e} _{i}=q^{1}\mathbf {e} _{1}+q^{2}\mathbf {e} _{2}+q^{3}\mathbf {e} _{3}.

Combining the above relations, we have

\mathbf {v} =(\mathbf {v} \cdot \mathbf {e} _{i})\mathbf {e} ^{i}=(\mathbf {v} \cdot \mathbf {e} ^{i})\mathbf {e} _{i}

and we can convert between the contravariant and covariant components with

q_{i}=\mathbf {v} \cdot \mathbf {e} _{i}=(q^{j}\mathbf {e} _{j})\cdot \mathbf {e} _{i}=(\mathbf {e} _{j}\cdot \mathbf {e} _{i})q^{j}

and

q^{i}=\mathbf {v} \cdot \mathbf {e} ^{i}=(q_{j}\mathbf {e} ^{j})\cdot \mathbf {e} ^{i}=(\mathbf {e} ^{j}\cdot \mathbf {e} ^{i})q_{j}.

The indices of covariant coordinates, vectors, and tensors are subscripts. If the contravariant basis vectors are orthonormal then they are equivalent to the covariant basis vectors, so there is no need to distinguish between the covariant and contravariant coordinates, and all indices are subscripts.
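
The formulas of this section can be tried out on a concrete non-orthogonal basis. The following is a minimal sketch in plain Python; the basis vectors e₁, e₂, e₃ and the vector v are arbitrary integer choices made for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def reciprocal_basis(e1, e2, e3):
    """Dual basis e^1, e^2, e^3 from the cross-product formulas (integer input)."""
    volume = dot(e1, cross(e2, e3))   # scalar triple product, must be nonzero
    scale = lambda w: [Fraction(c, volume) for c in w]
    return [scale(cross(e2, e3)), scale(cross(e3, e1)), scale(cross(e1, e2))]

# A basis of R^3 that is neither orthogonal nor normalized.
e = [[1, 0, 0], [1, 1, 0], [1, 1, 2]]
e_dual = reciprocal_basis(*e)

v = [2, 3, 4]
q_upper = [dot(v, ed) for ed in e_dual]   # contravariant components q^i = v . e^i
q_lower = [dot(v, ei) for ei in e]        # covariant components     q_i = v . e_i

def expand(comps, basis):
    """Form sum_i comps[i] * basis[i]."""
    return [sum(c * b[k] for c, b in zip(comps, basis)) for k in range(3)]

# Mutual duality: e^i . e_j is the Kronecker delta.
assert all(dot(e_dual[i], e[j]) == (1 if i == j else 0)
           for i in range(3) for j in range(3))
# Both expansions reproduce v.
assert expand(q_upper, e) == v
assert expand(q_lower, e_dual) == v
# Conversion q_i = (e_j . e_i) q^j.
assert q_lower == [sum(dot(e[j], e[i]) * q_upper[j] for j in range(3))
                   for i in range(3)]
```

For an orthonormal basis the reciprocal basis coincides with the original one, and the two lists of components would be equal.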

Informal usage

In the field of physics, the adjective covariant is often used informally as a synonym for invariant. For example, the Schrödinger equation does not keep its written form under the coordinate transformations of special relativity. Thus, a physicist might say that the Schrödinger equation is not covariant. In contrast, the Klein-Gordon equation and the Dirac equation do keep their written form under these coordinate transformations. Thus, a physicist might say that these equations are covariant.

Despite the dominant usage of "covariant", it is more accurate to say that the Klein-Gordon and Dirac equations are invariant, and that the Schrödinger equation is not invariant. Additionally, to remove ambiguity, the transformation by which the invariance is evaluated should be indicated. Continuing with the above example, neither the Klein-Gordon nor the Dirac equation is invariant under every coordinate transformation (e.g. those of general relativity), so an unambiguous description of these equations is that they are invariant with respect to the coordinate transformations of special relativity.

Because the components of vectors are contravariant and those of covectors are covariant, the vectors themselves are often referred to as being contravariant and the covectors as covariant. This usage is not universal, however, since vectors push forward – are covariant under diffeomorphism – and covectors pull back – are contravariant under diffeomorphism. See Einstein notation for details.

Use in tensor analysis

The distinction between covariance and contravariance is particularly important for computations with tensors, which often have mixed variance. This means that they have both covariant and contravariant components, or both vector and dual vector components. The valence of a tensor is the number of contravariant and covariant terms, and in Einstein notation, covariant components have lower indices, while contravariant components have upper indices. The duality between covariance and contravariance intervenes whenever a vector or tensor quantity is represented by its components, although modern differential geometry uses more sophisticated index-free methods to represent tensors.

In tensor analysis, a covariant vector varies more or less reciprocally to a corresponding contravariant vector. Expressions for lengths, areas and volumes of objects in the vector space can then be given in terms of tensors with covariant and contravariant indices. Under simple expansions and contractions of the coordinates, the reciprocity is exact; under affine transformations the components of a vector intermingle on going between covariant and contravariant expression.

On a manifold, a tensor field will typically have multiple indices, of two sorts. By a widely followed convention, covariant indices are written as lower indices, whereas contravariant indices are upper indices. When the manifold is equipped with a metric, covariant and contravariant indices become very closely related to one another. A contravariant index can be turned into a covariant index by contracting with the metric tensor. Conversely, a covariant index can be turned into a contravariant index by contracting with the (matrix) inverse of the metric tensor. Note that in general, no such relation exists in spaces not endowed with a metric tensor. Furthermore, from a more abstract standpoint, a tensor is simply "there" and its components of either kind are only calculational artifacts whose values depend on the chosen coordinates.
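
Lowering and raising an index can be sketched at a single point with plain Python. The metric components g_ij below are an assumed example, namely the Gram matrix of the basis (1, 0), (1, 1) of the Euclidean plane, and the helper names are hypothetical.

```python
from fractions import Fraction

def matvec(M, x):
    """Contract a two-index array (list of rows) with a one-index array."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def inverse_2x2(M):
    """Exact inverse of a 2x2 integer matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

g = [[1, 1], [1, 2]]          # metric components g_ij (assumed example)
g_inv = inverse_2x2(g)        # inverse metric components g^ij

v_upper = [3, 4]                      # contravariant components v^i
v_lower = matvec(g, v_upper)          # lowering: v_i = g_ij v^j
v_raised = matvec(g_inv, v_lower)     # raising:  v^i = g^ij v_j

# Raising undoes lowering.
assert v_raised == v_upper

# The full contraction v^i v_i is the squared length of the vector.
length_squared = sum(up * low for up, low in zip(v_upper, v_lower))
```

In this example the vector is 3·(1, 0) + 4·(1, 1) = (7, 4) in Cartesian terms, so the contraction vⁱv_i agrees with the Euclidean value 7² + 4² = 65.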

The explanation in geometric terms is that a general tensor will have contravariant indices as well as covariant indices, because it has parts that live in the tangent bundle as well as the cotangent bundle.

A contravariant vector is one which transforms like ${\frac {dx^{\mu }}{d\tau }}$, where $x^{\mu }$ are the coordinates of a particle at its proper time $\tau$. A covariant vector is one which transforms like ${\frac {\partial \phi }{\partial x^{\mu }}}$, where $\phi$ is a scalar field.

Algebra and geometry

In category theory, there are covariant functors and contravariant functors. The dual space of a vector space is a standard example of a contravariant functor. Some constructions of multilinear algebra are of 'mixed' variance, which prevents them from being functors.

In geometry, the same map in/map out distinction is helpful in assessing the variance of constructions. A tangent vector to a smooth manifold M is, to begin with, a curve mapping smoothly into M and passing through a given point P. It is therefore covariant with respect to smooth mappings of M. A covector, or 1-form, is in the same way constructed from a smooth mapping from M to the real line, near P. It is in the cotangent bundle, built up from the dual spaces of the tangent spaces. Its components with respect to a local basis of one-forms dxⁱ will be covariant; but one-forms and differential forms in general are contravariant, in the sense that they pull back under smooth mappings. This is crucial to how they are applied; for example, a differential form can be restricted to any submanifold, while this does not make the same sense for a field of tangent vectors.

Covariant and contravariant components transform in different ways under coordinate transformations. By considering a coordinate transformation on a manifold as a map from the manifold to itself, the transformation of the covariant indices of a tensor is given by a pullback, and the transformation of the contravariant indices is given by a pushforward.

Notes

^ A basis f may here profitably be viewed as a linear isomorphism from Rⁿ to V. Regarding f as a row vector whose entries are the elements of the basis, the associated linear isomorphism is then $\mathbf {x} \mapsto \mathbf {f} \mathbf {x} .$

References

Arfken, George B.; Weber, Hans J. (2005), Mathematical Methods for Physicists (6th ed.), San Diego: Harcourt, ISBN 0-12-059876-0.
Dodson, C. T. J.; Poston, T. (1991), Tensor geometry, Graduate Texts in Mathematics, vol. 130 (2nd ed.), Berlin, New York: Springer-Verlag, ISBN 978-3-540-52018-4, MR1223091.
Greub, Werner Hildbert (1967), Multilinear algebra, Die Grundlehren der Mathematischen Wissenschaften, Band 136, Springer-Verlag New York, Inc., New York, MR0224623.
Sternberg, Shlomo (1983), Lectures on differential geometry, New York: Chelsea, ISBN 978-0-8284-0316-0.
Sylvester, J.J. (1853), "On a Theory of the Syzygetic Relations of Two Rational Integral Functions, Comprising an Application to the Theory of Sturm's Functions, and That of the Greatest Algebraical Common Measure", Philosophical Transactions of the Royal Society of London, 143: 407–548.

External links

Weisstein, Eric W. "Covariant Tensor". MathWorld.
