= Connection (mathematics) =

In geometry, the notion of a connection makes precise the idea of transporting local geometric objects, such as tangent vectors or tensors in the tangent space, along a curve or family of curves in a parallel and consistent manner. There are various kinds of connections in modern geometry, depending on what sort of data one wants to transport. For instance, an affine connection, the most elementary type of connection, gives a means for parallel transport of tangent vectors on a manifold from one point to another along a curve. An affine connection is typically given in the form of a covariant derivative, which gives a means for taking directional derivatives of vector fields, measuring the deviation of a vector field from being parallel in a given direction.

Connections are of central importance in modern geometry in large part because they allow a comparison between the local geometry at one point and the local geometry at another point. Differential geometry embraces several variations on the connection theme, which fall into two major groups: the infinitesimal theory and the local theory. The local theory concerns itself primarily with notions of parallel transport and holonomy. The infinitesimal theory concerns itself with the differentiation of geometric data. Thus a covariant derivative is a way of specifying a derivative of a vector field along another vector field on a manifold. A Cartan connection is a way of formulating some aspects of connection theory using differential forms and Lie groups. An Ehresmann connection is a connection in a fibre bundle or a principal bundle, given by specifying the allowed directions of motion of the field. A Koszul connection is a connection which defines a directional derivative for sections of a vector bundle more general than the tangent bundle.

Connections also lead to convenient formulations of geometric invariants, such as the curvature (see also curvature tensor and curvature form) and the torsion tensor.

==Motivation: the unsuitability of coordinates==

Consider the following problem. Suppose that a tangent vector to the sphere S is given at the north pole, and we are to define a manner of consistently moving this vector to other points of the sphere: a means for parallel transport. Naively, this could be done using a particular coordinate system. However, unless proper care is applied, the parallel transport defined in one system of coordinates will not agree with that of another coordinate system. A more appropriate notion of parallel transport exploits the symmetry of the sphere under rotation. Given a vector at the north pole, one can transport this vector along a curve by rotating the sphere in such a way that the north pole moves along the curve without axial rolling. This latter means of parallel transport is the Levi-Civita connection on the sphere. If two different curves are given with the same initial and terminal point, and a vector v is rigidly moved along the first curve by a rotation, the resulting vector at the terminal point will in general be different from the vector resulting from rigidly moving v along the second curve. This phenomenon reflects the curvature of the sphere. A simple mechanical device that can be used to visualize parallel transport is the south-pointing chariot.
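
The dependence of parallel transport on the path can be checked numerically. The following is a minimal sketch, not a definitive implementation: it assumes the unit sphere embedded in R^{3} and approximates Levi-Civita transport by repeatedly projecting the vector onto the tangent plane at the next point of the path and restoring its length. The helper names (`transport`, `meridian`, `equator`, `triangle_holonomy`) and the choice of a geodesic triangle with one vertex at the north pole are illustrative assumptions.

```python
import math

def transport(v, path):
    """Approximately parallel-transport the tangent vector v along `path`,
    a list of closely spaced points on the unit sphere, by projecting onto
    each successive tangent plane and restoring the original length."""
    length = math.sqrt(sum(c * c for c in v))
    for p in path[1:]:
        dot = sum(a * b for a, b in zip(v, p))
        v = [a - dot * b for a, b in zip(v, p)]   # drop the normal component
        norm = math.sqrt(sum(c * c for c in v))
        v = [length * c / norm for c in v]        # keep the length fixed
    return v

def meridian(lon, n):
    """Points from the north pole to the equator along longitude `lon`."""
    pts = []
    for k in range(n + 1):
        colat = 0.5 * math.pi * k / n
        pts.append((math.sin(colat) * math.cos(lon),
                    math.sin(colat) * math.sin(lon),
                    math.cos(colat)))
    return pts

def equator(lon_end, n):
    """Points along the equator from longitude 0 to longitude `lon_end`."""
    return [(math.cos(lon_end * k / n), math.sin(lon_end * k / n), 0.0)
            for k in range(n + 1)]

def triangle_holonomy(alpha, n=200):
    """Transport (1, 0, 0) from the north pole around a geodesic triangle:
    down longitude 0, along the equator to longitude alpha, and back up."""
    path = meridian(0.0, n) + equator(alpha, n) + meridian(alpha, n)[::-1]
    return transport([1.0, 0.0, 0.0], path)

result = triangle_holonomy(math.pi / 2)
print(result)
```

With `alpha` equal to a right angle, the vector that started as (1, 0, 0) returns to the north pole pointing along (0, 1, 0) up to rounding error: it has been rotated by the angle `alpha`, which on the unit sphere is also the area enclosed by the triangle.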

To see concretely how coordinates can mislead, suppose that S is the unit sphere with coordinates given by stereographic projection. Regard S as consisting of unit vectors in R^{3}. Then S carries a pair of coordinate patches corresponding to the projections from the north pole and the south pole. The mappings
$\begin{align}
\varphi_0(x,y) & = \left(\frac{2x}{1+x^2+y^2}, \frac{2y}{1+x^2+y^2}, \frac{1-x^2-y^2}{1+x^2+y^2}\right)\\[8pt]
\varphi_1(x,y) & = \left(\frac{2x}{1+x^2+y^2}, \frac{2y}{1+x^2+y^2}, \frac{x^2+y^2-1}{1+x^2+y^2}\right)
\end{align}$
cover a neighborhood U_{0} of the north pole and U_{1} of the south pole, respectively. Let X, Y, Z be the ambient coordinates in R^{3}. Then φ_{0} and φ_{1} have inverses
$\begin{align}
\varphi_0^{-1}(X,Y,Z) &= \left(\frac{X}{Z+1}, \frac{Y}{Z+1}\right), \\[8pt]
\varphi_1^{-1}(X,Y,Z) &= \left(\frac{-X}{Z-1}, \frac{-Y}{Z-1}\right),
\end{align}$
so that the coordinate transition function is inversion in the unit circle:

$\varphi_{01}(x,y) = \varphi_0^{-1}\circ\varphi_1(x,y) = \left(\frac{x}{x^2+y^2},\frac{y}{x^2+y^2}\right)$
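
The formulas above can be sanity-checked numerically. The sketch below transcribes φ_{0}, φ_{1} and their inverses directly from the text and compares φ_{0}^{−1} ∘ φ_{1} with inversion in the unit circle at a few arbitrarily chosen sample points; the function names and the sample points are illustrative assumptions.

```python
def phi0(x, y):
    """Chart around the north pole (0, 0, 1)."""
    d = 1.0 + x * x + y * y
    return (2 * x / d, 2 * y / d, (1 - x * x - y * y) / d)

def phi1(x, y):
    """Chart around the south pole (0, 0, -1)."""
    d = 1.0 + x * x + y * y
    return (2 * x / d, 2 * y / d, (x * x + y * y - 1) / d)

def phi0_inv(X, Y, Z):
    return (X / (Z + 1), Y / (Z + 1))

def phi1_inv(X, Y, Z):
    return (-X / (Z - 1), -Y / (Z - 1))

def phi01(x, y):
    """Transition map phi0^{-1} o phi1, defined away from the origin."""
    return phi0_inv(*phi1(x, y))

def inversion(x, y):
    """Inversion in the unit circle."""
    r2 = x * x + y * y
    return (x / r2, y / r2)

samples = [(0.5, -1.25), (2.0, 3.0), (-0.3, 0.4)]  # arbitrary test points
max_err = max(
    abs(a - b)
    for (x, y) in samples
    for a, b in zip(phi01(x, y), inversion(x, y))
)
print(max_err)  # expected to be at rounding-error level
```

The transition map is undefined at the origin of the φ_{1} chart, which corresponds to the south pole, the one point not covered by φ_{0}.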

Let us now represent a vector field $v$ on S (an assignment of a tangent vector to each point in S) in local coordinates. If P is a point of U_{0} ⊂ S, then a vector field may be represented by the pushforward of a vector field v_{0} on R^{2} by $\varphi_0$:

$v(P) = J_{\varphi_0}\left(\varphi_0^{-1}(P)\right) \cdot {\mathbf v}_0\left(\varphi_0^{-1}(P)\right),$

where $J_{\varphi_0}$ denotes the Jacobian matrix of φ_{0} ($d{\varphi_0}_x({\mathbf u}) = J_{\varphi_0}(x)\cdot {\mathbf u}$), and v_{0} = v_{0}(x, y) is a vector field on R^{2} uniquely determined by v (since the pushforward of a local diffeomorphism at any point is invertible). Furthermore, on the overlap between the coordinate charts U_{0} ∩ U_{1}, it is possible to represent the same vector field with respect to the φ_{1} coordinates:

$v(P) = J_{\varphi_1}\left(\varphi_1^{-1}(P)\right) \cdot {\mathbf v}_1\left(\varphi_1^{-1}(P)\right).$

To relate the components v_{0} and v_{1}, apply the chain rule to the identity φ_{1} = φ_{0} ∘ φ_{01}:

$J_{\varphi_1}\left(\varphi_1^{-1}(P)\right) = J_{\varphi_0}\left(\varphi_0^{-1}(P)\right) \cdot J_{\varphi_{01}}\left(\varphi_1^{-1}(P)\right).$

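Applying both sides of this matrix equation to the component vector v_{1}(φ_{1}^{−1}(P)) and invoking the representations of v(P) in the φ_{0} and φ_{1} coordinates yields

${\mathbf v}_0\left(\varphi_0^{-1}(P)\right) = J_{\varphi_{01}}\left(\varphi_1^{-1}(P)\right) \cdot {\mathbf v}_1\left(\varphi_1^{-1}(P)\right).$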
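
As a rough numerical check of this transformation rule, the sketch below picks a hypothetical point and component vector in the φ_{1} chart, computes v_{0} = J_{φ_{01}} · v_{1}, and confirms that pushing v_{1} forward by φ_{1} and v_{0} forward by φ_{0} gives the same tangent vector in R^{3}. The pushforwards are approximated by central differences; the sample values, step size and helper names are assumptions made for illustration.

```python
def phi0(x, y):
    d = 1.0 + x * x + y * y
    return (2 * x / d, 2 * y / d, (1 - x * x - y * y) / d)

def phi1(x, y):
    d = 1.0 + x * x + y * y
    return (2 * x / d, 2 * y / d, (x * x + y * y - 1) / d)

def pushforward(f, x, y, u, h=1e-6):
    """Approximate J_f(x, y) . u by a central difference along u."""
    fp = f(x + h * u[0], y + h * u[1])
    fm = f(x - h * u[0], y - h * u[1])
    return tuple((a - b) / (2 * h) for a, b in zip(fp, fm))

def j_phi01(x, y):
    """Jacobian matrix of the inversion (x, y) -> (x, y) / (x^2 + y^2)."""
    r4 = (x * x + y * y) ** 2
    a = (y * y - x * x) / r4
    b = -2 * x * y / r4
    return ((a, b), (b, -a))

# Hypothetical sample data: a point in phi1 coordinates and components v1 there.
x1, y1 = 1.5, 0.5
v1 = (1.0, 2.0)

# The same point in phi0 coordinates, and the transformed components v0.
r2 = x1 * x1 + y1 * y1
x0, y0 = x1 / r2, y1 / r2
J = j_phi01(x1, y1)
v0 = (J[0][0] * v1[0] + J[0][1] * v1[1],
      J[1][0] * v1[0] + J[1][1] * v1[1])

# Both coordinate representations should give the same tangent vector in R^3.
from_chart1 = pushforward(phi1, x1, y1, v1)
from_chart0 = pushforward(phi0, x0, y0, v0)
mismatch = max(abs(a - b) for a, b in zip(from_chart0, from_chart1))
print(v0, mismatch)
```

The components themselves differ between the two charts, while the tangent vector they describe on the sphere is the same up to the finite-difference error.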

We come now to the main question of defining how to transport a vector field parallelly along a curve. Suppose that P(t) is a curve in S. Naïvely, one may consider a vector field parallel if the coordinate components of the vector field are constant along the curve. However, an immediate ambiguity arises: in which coordinate system should these components be constant?

For instance, suppose that v(P(t)) has constant components in the U_{1} coordinate system. That is, the functions v_{1}(φ_{1}^{−1}(P(t))) are constant. However, applying the product rule to the relation v_{0} = J_{φ_{01}} · v_{1} between the two sets of components and using the fact that dv_{1}/dt = 0 gives

$\frac{d}{dt}{\mathbf v}_0\left(\varphi_0^{-1}(P(t))\right) = \left(\frac{d}{dt}J_{\varphi_{01}}\left(\varphi_1^{-1}(P(t))\right)\right) \cdot {\mathbf v}_1\left(\varphi_1^{-1}\left(P(t)\right)\right).$

But $\left(\frac{d}{dt}J_{\varphi_{01}}\left(\varphi_1^{-1}(P(t))\right)\right)$ is always a non-singular matrix (provided that the curve P(t) is not stationary), so, unless the vector field vanishes, v_{1} and v_{0} cannot be simultaneously constant along the curve.
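
The non-singularity claim can be illustrated on a single example. The sketch below assumes a hypothetical straight-line curve in the φ_{1} coordinates, differentiates the Jacobian of the inversion along it by central differences, and evaluates both the determinant of that derivative and the resulting rate of change of v_{0} when v_{1} is held constant. The curve, the components of v_{1} and the step size are illustrative choices only.

```python
def j_phi01(x, y):
    """Jacobian matrix of the inversion (x, y) -> (x, y) / (x^2 + y^2)."""
    r4 = (x * x + y * y) ** 2
    a = (y * y - x * x) / r4
    b = -2 * x * y / r4
    return ((a, b), (b, -a))

def curve(t):
    """A hypothetical non-stationary curve, written in phi1 coordinates."""
    return (1.0 + t, 0.5)

def dj_dt(t, h=1e-5):
    """Central-difference approximation to d/dt of J_phi01 along the curve."""
    jp = j_phi01(*curve(t + h))
    jm = j_phi01(*curve(t - h))
    return tuple(tuple((p - m) / (2 * h) for p, m in zip(rp, rm))
                 for rp, rm in zip(jp, jm))

v1 = (1.0, 0.0)  # components held constant in the phi1 chart
dJ = dj_dt(0.0)
det = dJ[0][0] * dJ[1][1] - dJ[0][1] * dJ[1][0]
dv0_dt = (dJ[0][0] * v1[0] + dJ[0][1] * v1[1],
          dJ[1][0] * v1[0] + dJ[1][1] * v1[1])
print(det, dv0_dt)
```

For this curve the determinant is approximately −2.048, matching the value −4|P′|²/(x²+y²)³ that a direct calculation gives for the inversion, so the derivative of the Jacobian is invertible and the components v_{0} change even though v_{1} does not.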

===Resolution===
The problem observed above is that the usual directional derivative of vector calculus does not behave well under changes in the coordinate system when applied to the components of vector fields. This makes it quite difficult to describe how to translate vector fields in a parallel manner, if indeed such a notion makes any sense at all. There are two fundamentally different ways of resolving this problem.

The first approach is to examine what is required for a generalization of the directional derivative to "behave well" under coordinate transitions. This is the tactic taken by the covariant derivative approach to connections: good behavior is equated with covariance. Here one considers a modification of the directional derivative by a certain linear operator, whose components are called the Christoffel symbols, which involves no derivatives on the vector field itself. The directional derivative D_{u}v of the components of a vector v in a coordinate system φ in the direction u is replaced by a covariant derivative:

$\nabla_{\mathbf u} {\mathbf v} = D_{\mathbf u} {\mathbf v} + \Gamma(\varphi)\{{\mathbf u},{\mathbf v}\}$
