First class constraint

In physics, a first class constraint is a dynamical quantity in a constrained Hamiltonian system whose Poisson bracket with all the other constraints vanishes on the constraint surface in phase space (the surface implicitly defined by the simultaneous vanishing of all the constraints). To calculate the first class constraint, one assumes that there are no second class constraints, or that they have been calculated previously, and their Dirac brackets generated.^[1]

First and second class constraints were introduced by Dirac (1950, p.136, 1964, p.17) as a way of quantizing mechanical systems such as gauge theories where the symplectic form is degenerate.^[2]^[3]

The terminology of first and second class constraints is confusingly similar to that of primary and secondary constraints, which instead reflects the manner in which the constraints are generated. These divisions are independent: both first and second class constraints can be either primary or secondary, giving four different classes of constraints altogether.

Poisson brackets

Consider a Poisson manifold M with a smooth Hamiltonian over it (for field theories, M would be infinite-dimensional).

Suppose we have some constraints

f_{i}(x)=0,

for n smooth functions

\{f_{i}\}_{i=1}^{n}

These will only be defined chartwise in general. Suppose that everywhere on the constrained set, the n derivatives of the n functions are all linearly independent and also that the Poisson brackets

\{f_{i},f_{j}\}

and

\{f_{i},H\}

all vanish on the constrained subspace.

This means we can write

\{f_{i},f_{j}\}=\sum _{k}c_{ij}^{k}f_{k}

for some smooth functions $c_{ij}^{k}$ (there is a theorem showing this), and

\{f_{i},H\}=\sum _{j}v_{i}^{j}f_{j}

for some smooth functions $v_{i}^{j}$ .

This can be done globally, using a partition of unity. Then, we say we have an irreducible first-class constraint (irreducible here is in a different sense from that used in representation theory).
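As a minimal numerical sketch of the closure relations above, consider a hypothetical toy system (an assumption for illustration, not taken from the text) with canonical coordinates $(q_{1},q_{2},p_{1},p_{2})$, constraints $f_{1}=p_{1}$ and $f_{2}=p_{2}+q_{1}p_{1}$, and Hamiltonian $H=(p_{1}^{2}+p_{2}^{2})/2$. A short hand calculation gives $\{f_{1},f_{2}\}=-f_{1}$ and $\{f_{2},H\}=p_{1}f_{1}$, so that $c_{12}^{1}=-1$ and $v_{2}^{1}=p_{1}$. The Python below, whose function names are purely illustrative, estimates the canonical Poisson bracket by central differences and checks these two relations at a point off the constraint surface, and the vanishing of the bracket on it.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """Central-difference estimate of the canonical bracket
    {F, G} = sum_i dF/dq_i * dG/dp_i - dF/dp_i * dG/dq_i."""
    def partial(func, slot, i):
        # Shift coordinate i of q (slot 0) or of p (slot 1) by +/- h.
        plus, minus = [list(q), list(p)], [list(q), list(p)]
        plus[slot][i] += h
        minus[slot][i] -= h
        return (func(*plus) - func(*minus)) / (2 * h)

    return sum(
        partial(F, 0, i) * partial(G, 1, i) - partial(F, 1, i) * partial(G, 0, i)
        for i in range(len(q))
    )


# Hypothetical toy constraints and Hamiltonian (assumptions for illustration).
def f1(q, p):
    return p[0]


def f2(q, p):
    return p[1] + q[0] * p[0]


def H(q, p):
    return 0.5 * (p[0] ** 2 + p[1] ** 2)


q, p = [0.7, -1.3], [0.4, 2.1]  # a generic point, off the constraint surface

# {f1, f2} + f1 should vanish, i.e. c_12^1 = -1.
closure_residual = poisson_bracket(f1, f2, q, p) + f1(q, p)
# {f2, H} - p1 * f1 should vanish, i.e. v_2^1 = p1.
hamiltonian_residual = poisson_bracket(f2, H, q, p) - p[0] * f1(q, p)
# On the constraint surface p1 = p2 = 0 the bracket itself vanishes.
on_shell = poisson_bracket(f1, f2, q, [0.0, 0.0])

print(closure_residual, hamiltonian_residual, on_shell)
```

The three printed numbers should be zero up to rounding. Note that the coefficient $v_{2}^{1}$ is a function on phase space rather than a constant, which the general definition allows.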

Geometric theory

For a more elegant formulation, suppose we are given a vector bundle over ${\mathcal {M}}$, with $n$-dimensional fiber $V$. Equip this vector bundle with a connection. Suppose too that we have a smooth section $f$ of this bundle.

Then the covariant derivative of $f$ with respect to the connection is a smooth linear map $\nabla f$ from the tangent bundle $T{\mathcal {M}}$ to $V$, which preserves the base point. Assume this linear map is right invertible (i.e. there exists a linear map $g$ such that $(\nabla f)g$ is the identity map) for all the fibers at the zeros of $f$. Then, according to the implicit function theorem, the subspace of zeros of $f$ is a submanifold.

The ordinary Poisson bracket is only defined over $C^{\infty }(M)$ , the space of smooth functions over M. However, using the connection, we can extend it to the space of smooth sections of $f$ if we work with the algebra bundle with the graded algebra of V-tensors as fibers.

Assume also that under this Poisson bracket, $\{f,f\}=0$ (note that it's not true that $\{g,g\}=0$ in general for this "extended Poisson bracket" anymore) and $\{f,H\}=0$ on the submanifold of zeros of $f$ (If these brackets also happen to be zero everywhere, then we say the constraints close off shell). It turns out the right invertibility condition and the commutativity of flows conditions are independent of the choice of connection. So, we can drop the connection provided we are working solely with the restricted subspace.

Intuitive meaning

What does it all mean intuitively? It means the Hamiltonian and constraint flows all commute with each other on the constrained subspace; or alternatively, that if we start on a point on the constrained subspace, then the Hamiltonian and constraint flows all bring the point to another point on the constrained subspace.

Since we wish to restrict ourselves to the constrained subspace only, this suggests that the Hamiltonian, or any other physical observable, should only be defined on that subspace. Equivalently, we can look at the equivalence class of smooth functions over the symplectic manifold, which agree on the constrained subspace (the quotient algebra by the ideal generated by the $f$ 's, in other words).

The catch is, the Hamiltonian flows on the constrained subspace depend on the gradient of the Hamiltonian there, not its value. But there's an easy way out of this.

Look at the orbits of the constrained subspace under the action of the symplectic flows generated by the $f$'s. This gives a local foliation of the subspace because it satisfies integrability conditions (Frobenius theorem). It turns out that if we start with two different points on the same orbit on the constrained subspace and evolve them under two different Hamiltonians, respectively, which agree on the constrained subspace, then the time evolution of both points under their respective Hamiltonian flows will always lie in the same orbit at equal times. It also turns out that if we have two smooth functions A₁ and B₁ which are constant over orbits at least on the constrained subspace (i.e. physical observables, satisfying {A₁,f}={B₁,f}=0 over the constrained subspace), and another two, A₂ and B₂, which are also constant over orbits and such that A₁ and B₁ agree with A₂ and B₂ respectively over the constrained subspace, then their Poisson brackets {A₁, B₁} and {A₂, B₂} are also constant over orbits and agree over the constrained subspace.

In general, one cannot rule out "ergodic" flows (which basically means that an orbit is dense in some open set), or "subergodic" flows (in which an orbit is dense in some submanifold of dimension greater than the orbit's dimension). We can't have self-intersecting orbits.

For most "practical" applications of first-class constraints, we do not see such complications: the quotient space of the restricted subspace by the f-flows (in other words, the orbit space) is well behaved enough to act as a differentiable manifold, which can be turned into a symplectic manifold by projecting the symplectic form of M onto it (this can be shown to be well defined). In light of the observation about physical observables mentioned earlier, we can work with this more "physical" smaller symplectic manifold, but with 2n fewer dimensions.

In general, the quotient space is a bit difficult to work with when doing concrete calculations (not to mention nonlocal when working with diffeomorphism constraints), so what is usually done instead is something similar. Note that the restricted submanifold is a bundle (but not a fiber bundle in general) over the quotient manifold. So, instead of working with the quotient manifold, we can work with a section of the bundle instead. This is called gauge fixing.

The major problem is that this bundle might not have a global section in general. This is where the "problem" of global anomalies comes in, for example. A global anomaly is different from the Gribov ambiguity, which is when a gauge fixing doesn't work to fix a gauge uniquely; in a global anomaly, there is no consistent definition of the gauge field. A global anomaly is a barrier to defining a quantum gauge theory discovered by Witten in 1980.

What have been described are irreducible first-class constraints. Another complication is that $\nabla f$ might not be right invertible on subspaces of the restricted submanifold of codimension 1 or greater (which violates the stronger assumption stated earlier in this article). This happens, for example, in the cotetrad formulation of general relativity, at the subspace of configurations where the cotetrad field and the connection form happen to be zero over some open subset of space. Here, the constraints are the diffeomorphism constraints.

One way around this is as follows. For reducible constraints, we relax the condition on the right invertibility of $\nabla f$ to this one: any smooth function that vanishes at the zeros of $f$ is the fiberwise contraction of $f$ with (a non-unique) smooth section of a ${\bar {V}}$-vector bundle, where ${\bar {V}}$ is the dual vector space to the constraint vector space $V$. This is called the regularity condition.

Constrained Hamiltonian dynamics from a Lagrangian gauge theory

First of all, we will assume the action is the integral of a local Lagrangian that depends only on the fields and their first derivatives. The analysis of more general cases, while possible, is more complicated. When going over to the Hamiltonian formalism, we find there are constraints. Recall that in the action formalism, there are on shell and off shell configurations. The constraints that hold off shell are called primary constraints while those that only hold on shell are called secondary constraints.

Examples

Consider the dynamics of a single point particle of mass $m$ with no internal degrees of freedom moving in a pseudo-Riemannian spacetime manifold $S$ with metric g. Assume also that the parameter $τ$ describing the trajectory of the particle is arbitrary (i.e. we insist upon reparametrization invariance). Then, its symplectic space is the cotangent bundle T*S with the canonical symplectic form $ω$ .

If we coordinatize T*S by its position $x$ in the base manifold $S$ and its position $p$ within the cotangent space, then we have a constraint

f = m² −g(x)⁻¹(p,p) = 0 .

The Hamiltonian $H$ is, surprisingly enough, $H=0$. In light of the observation that the Hamiltonian is only defined up to the equivalence class of smooth functions agreeing on the constrained subspace, we can use a new Hamiltonian $H'=f$ instead. Then, we have the interesting case where the Hamiltonian is the same as a constraint! See Hamiltonian constraint for more details.

Consider now the case of a Yang–Mills theory for a real simple Lie algebra $L$ (with a negative definite Killing form $η$ ) minimally coupled to a real scalar field $σ$ , which transforms as an orthogonal representation $ρ$ with the underlying vector space $V$ under $L$ in ( $d$ − 1) + 1 Minkowski spacetime. For $l$ in $L$ , we write

ρ(l)[σ]

as

l[σ]

for simplicity. Let A be the $L$ -valued connection form of the theory. Note that the A here differs from the A used by physicists by a factor of $i$ and $g$ . This agrees with the mathematician's convention.

The action $S$ is given by

S[\mathbf {A} ,\sigma ]=\int d^{d}x{\frac {1}{4g^{2}}}\eta ((\mathbf {g} ^{-1}\otimes \mathbf {g} ^{-1})(\mathbf {F} ,\mathbf {F} ))+{\frac {1}{2}}\alpha (\mathbf {g} ^{-1}(D\sigma ,D\sigma ))

where g is the Minkowski metric, F is the curvature form

d\mathbf {A} +\mathbf {A} \wedge \mathbf {A}

(no $i$ s or $g$ s!) where the second term is a formal shorthand for pretending the Lie bracket is a commutator, $D$ is the covariant derivative

Dσ = dσ − A[σ]

and $α$ is the orthogonal form for $ρ$ .

What is the Hamiltonian version of this model? Well, first, we have to split A noncovariantly into a time component $φ$ and a spatial part $\vec{A}$. Then, the resulting symplectic space has the conjugate variables $σ$, $\pi_{\sigma}$ (taking values in the underlying vector space of ${\bar {\rho }}$, the dual rep of $ρ$), $\vec{A}$, $\vec{\pi}_{A}$, $φ$ and $\pi_{\phi}$. For each spatial point, we have the constraints $\pi_{\phi}=0$ and the Gaussian constraint

{\vec {D}}\cdot {\vec {\pi }}_{A}-\rho '(\pi _{\sigma },\sigma )=0

where, since $ρ$ is an intertwiner

\rho :L\otimes V\rightarrow V,

$ρ'$ is the dualized intertwiner

\rho ':{\bar {V}}\otimes V\rightarrow L

( $L$ is self-dual via $η$ ). The Hamiltonian is

H_{f}=\int d^{d-1}x{\frac {1}{2}}\alpha ^{-1}(\pi _{\sigma },\pi _{\sigma })+{\frac {1}{2}}\alpha ({\vec {D}}\sigma \cdot {\vec {D}}\sigma )-{\frac {g^{2}}{2}}\eta ({\vec {\pi }}_{A},{\vec {\pi }}_{A})-{\frac {1}{2g^{2}}}\eta (\mathbf {B} \cdot \mathbf {B} )-\eta (\pi _{\phi },f)-\langle \pi _{\sigma },\phi [\sigma ]\rangle -\eta (\phi ,{\vec {D}}\cdot {\vec {\pi }}_{A}).

The last two terms are a linear combination of the Gaussian constraints, and we have a whole family of (gauge equivalent) Hamiltonians parametrized by $f$. In fact, since the last three terms vanish for the constrained states, we may drop them.

Second class constraints

In a constrained Hamiltonian system, a dynamical quantity is second class if its Poisson bracket with at least one constraint is nonvanishing. A constraint that has a nonzero Poisson bracket with at least one other constraint, then, is a second class constraint.

See Dirac brackets for diverse illustrations.

An example: a particle confined to a sphere

Before going on to the general theory, consider a specific example step by step to motivate the general analysis.

Start with the action describing a Newtonian particle of mass $m$ constrained to a spherical surface of radius $R$ within a uniform gravitational field $g$ . When one works in Lagrangian mechanics, there are several ways to implement a constraint: one can switch to generalized coordinates that manifestly solve the constraint, or one can use a Lagrange multiplier while retaining the redundant coordinates so constrained.

In this case, the particle is constrained to a sphere, therefore the natural solution would be to use angular coordinates to describe the position of the particle instead of Cartesian and solve (automatically eliminate) the constraint in that way (the first choice). For pedagogical reasons, instead, consider the problem in (redundant) Cartesian coordinates, with a Lagrange multiplier term enforcing the constraint.

The action is given by

S=\int dtL=\int dt\left[{\frac {m}{2}}({\dot {x}}^{2}+{\dot {y}}^{2}+{\dot {z}}^{2})-mgz+{\frac {\lambda }{2}}(x^{2}+y^{2}+z^{2}-R^{2})\right]

where the last term is the Lagrange multiplier term enforcing the constraint.

Of course, as indicated, we could have just used different, non-redundant, spherical coordinates and written it as

S=\int dt\left[{\frac {mR^{2}}{2}}({\dot {\theta }}^{2}+\sin ^{2}(\theta ){\dot {\phi }}^{2})+mgR\cos(\theta )\right]

instead, without extra constraints; but we are considering the former coordinatization to illustrate constraints.

The conjugate momenta are given by

p_{x}=m{\dot {x}},\quad p_{y}=m{\dot {y}},\quad p_{z}=m{\dot {z}},\quad p_{\lambda }=0.

Note that we can't determine $\dot{\lambda}$ from the momenta.

The Hamiltonian is given by

H={\vec {p}}\cdot {\dot {\vec {r}}}+p_{\lambda }{\dot {\lambda }}-L={\frac {p^{2}}{2m}}+p_{\lambda }{\dot {\lambda }}+mgz-{\frac {\lambda }{2}}(r^{2}-R^{2}).

We cannot eliminate $\dot{\lambda}$ at this stage. Here we treat $\dot{\lambda}$ as shorthand for a function on the symplectic space that we have yet to determine, not as an independent variable. For notational consistency, define $u_{1}=\dot{\lambda}$ from now on. The above Hamiltonian with the $p_{\lambda}$ term is the "naive Hamiltonian". Note that since, on-shell, the constraint must be satisfied, one cannot distinguish, on-shell, between the naive Hamiltonian and the above Hamiltonian with the undetermined coefficient, $\dot{\lambda}=u_{1}$.

We have the primary constraint

p_{\lambda }=0.

We require, on the grounds of consistency, that the Poisson bracket of all the constraints with the Hamiltonian vanish at the constrained subspace. In other words, the constraints must not evolve in time if they are going to be identically zero along the equations of motion.

From this consistency condition, we immediately get the secondary constraint

${\begin{aligned}0&=\{H,p_{\lambda }\}_{\text{PB}}\\&=\sum _{i}{\frac {\partial H}{\partial q_{i}}}{\frac {\partial p_{\lambda }}{\partial p_{i}}}-{\frac {\partial H}{\partial p_{i}}}{\frac {\partial p_{\lambda }}{\partial q_{i}}}\\&={\frac {\partial H}{\partial \lambda }}\\&=-{\frac {1}{2}}(r^{2}-R^{2})\\&\Downarrow \\0&=r^{2}-R^{2}\end{aligned}}$

This constraint should be added into the Hamiltonian with an undetermined (not necessarily constant) coefficient $u_{2}$, enlarging the Hamiltonian to

H={\frac {p^{2}}{2m}}+mgz-{\frac {\lambda }{2}}(r^{2}-R^{2})+u_{1}p_{\lambda }+u_{2}(r^{2}-R^{2})~.

Similarly, from this secondary constraint, we find the tertiary constraint

${\begin{aligned}0&=\{H,r^{2}-R^{2}\}_{PB}\\&=\{H,x^{2}\}_{PB}+\{H,y^{2}\}_{PB}+\{H,z^{2}\}_{PB}\\&=-\left({\frac {\partial H}{\partial p_{x}}}2x+{\frac {\partial H}{\partial p_{y}}}2y+{\frac {\partial H}{\partial p_{z}}}2z\right)\\&=-{\frac {2}{m}}(p_{x}x+p_{y}y+p_{z}z)\\&\Downarrow \\0&={\vec {p}}\cdot {\vec {r}}\end{aligned}}$

Again, one should add this constraint into the Hamiltonian, since, on-shell, no one can tell the difference. Therefore, so far, the Hamiltonian looks like

H={\frac {p^{2}}{2m}}+mgz-{\frac {\lambda }{2}}(r^{2}-R^{2})+u_{1}p_{\lambda }+u_{2}(r^{2}-R^{2})+u_{3}{\vec {p}}\cdot {\vec {r}}~,

where $u_{1}$, $u_{2}$, and $u_{3}$ are still completely undetermined.

Note that, frequently, all constraints that are found from consistency conditions are referred to as secondary constraints and secondary, tertiary, quaternary, etc., constraints are not distinguished.

We keep turning the crank, demanding that this new constraint have vanishing Poisson bracket with the Hamiltonian:

0=\{{\vec {p}}\cdot {\vec {r}},\,H\}_{PB}={\frac {p^{2}}{m}}-mgz+\lambda r^{2}-2u_{2}r^{2}.

We might despair and think that there is no end to this, but because one of the new Lagrange multipliers has shown up, this is not a new constraint, but a condition that fixes the Lagrange multiplier:

u_{2}={\frac {\lambda }{2}}+{\frac {1}{r^{2}}}\left({\frac {p^{2}}{2m}}-{\frac {1}{2}}mgz\right).

Plugging this into our Hamiltonian gives us (after a little algebra)

$H={\frac {p^{2}}{2m}}(2-{\frac {R^{2}}{r^{2}}})+{\frac {1}{2}}mgz(1+{\frac {R^{2}}{r^{2}}})+u_{1}p_{\lambda }+u_{3}{\vec {p}}\cdot {\vec {r}}$
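The "little algebra" can be spot-checked numerically. The following Python sketch (function names and parameter values are illustrative assumptions) substitutes the expression for $u_{2}$ into the Hamiltonian containing the multiplier terms and compares the result with the simplified form at random phase-space points. The term $u_{1}p_{\lambda}$ is common to both forms and is left out.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def h_with_u2(r, p, lam, u3, m, g, R):
    """Hamiltonian with the multiplier terms, with u2 fixed by the
    consistency condition (the u1 * p_lambda term is omitted)."""
    r2, p2, z = dot(r, r), dot(p, p), r[2]
    u2 = lam / 2 + (p2 / (2 * m) - 0.5 * m * g * z) / r2
    return (p2 / (2 * m) + m * g * z - lam / 2 * (r2 - R**2)
            + u2 * (r2 - R**2) + u3 * dot(p, r))


def h_simplified(r, p, u3, m, g, R):
    """Simplified form after substituting u2 (u1 * p_lambda omitted)."""
    r2, p2, z = dot(r, r), dot(p, p), r[2]
    return (p2 / (2 * m) * (2 - R**2 / r2)
            + 0.5 * m * g * z * (1 + R**2 / r2) + u3 * dot(p, r))


random.seed(0)
all_match = True
for _ in range(1000):
    # Positions are kept away from r = 0, where 1/r^2 is singular.
    r = [random.uniform(0.5, 2.0) for _ in range(3)]
    p = [random.uniform(-2.0, 2.0) for _ in range(3)]
    lam, u3 = random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)
    a = h_with_u2(r, p, lam, u3, m=1.3, g=9.81, R=1.0)
    b = h_simplified(r, p, u3, m=1.3, g=9.81, R=1.0)
    all_match = all_match and math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)

print(all_match)
```

The two forms agree at every sampled point, off-shell as well as on-shell, and $\lambda$ cancels identically once $u_{2}$ is substituted.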

Now that there are new terms in the Hamiltonian, one should go back and check the consistency conditions for the primary and secondary constraints. The secondary constraint's consistency condition gives

{\frac {2}{m}}{\vec {r}}\cdot {\vec {p}}+2u_{3}r^{2}=0.

Again, this is not a new constraint; it only determines that

u_{3}=-{\frac {{\vec {r}}\cdot {\vec {p}}}{mr^{2}}}~.

At this point there are no more constraints or consistency conditions to check!

Putting it all together,

H=\left(2-{\frac {R^{2}}{r^{2}}}\right){\frac {p^{2}}{2m}}+{\frac {1}{2}}\left(1+{\frac {R^{2}}{r^{2}}}\right)mgz-{\frac {({\vec {r}}\cdot {\vec {p}})^{2}}{mr^{2}}}+u_{1}p_{\lambda }.

When finding the equations of motion, one should use the above Hamiltonian, and as long as one is careful to never use constraints before taking derivatives in the Poisson bracket then one gets the correct equations of motion. That is, the equations of motion are given by

{\dot {\vec {r}}}=\{{\vec {r}},\,H\}_{PB},\quad {\dot {\vec {p}}}=\{{\vec {p}},\,H\}_{PB},\quad {\dot {\lambda }}=\{\lambda ,\,H\}_{PB},\quad {\dot {p}}_{\lambda }=\{p_{\lambda },H\}_{PB}.
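As a rough numerical illustration, the following Python sketch integrates the $\vec{r}$ and $\vec{p}$ equations of motion generated by the Hamiltonian above with a fourth-order Runge–Kutta step. It sets $u_{1}=0$ (a gauge choice that does not affect $\vec{r}$ or $\vec{p}$), uses hand-computed partial derivatives of $H$, and picks illustrative values for $m$, $g$, $R$ and the initial data; all of these are assumptions made for the example. Starting on the constraint surface, $r^{2}-R^{2}$ and $\vec{p}\cdot\vec{r}$ should stay zero up to integration error.

```python
import math

M, G, R = 1.0, 9.81, 1.0  # illustrative parameter values (assumptions)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hamiltonian(state):
    r, p = state[:3], state[3:]
    r2, p2, rp = dot(r, r), dot(p, p), dot(r, p)
    return ((2 - R**2 / r2) * p2 / (2 * M)
            + 0.5 * (1 + R**2 / r2) * M * G * r[2]
            - rp**2 / (M * r2))


def flow(state):
    """Hamilton's equations: (dr/dt, dp/dt) = (dH/dp, -dH/dr)."""
    r, p = state[:3], state[3:]
    r2, p2, rp = dot(r, r), dot(p, p), dot(r, p)
    dH_dp = [(2 - R**2 / r2) * p[i] / M - 2 * rp * r[i] / (M * r2)
             for i in range(3)]
    # Coefficient of the radial part of dH/dr, derived by hand from H.
    radial = (R**2 * p2 / M - R**2 * M * G * r[2] + 2 * rp**2 / M) / r2**2
    dH_dr = [radial * r[i] - 2 * rp * p[i] / (M * r2) for i in range(3)]
    dH_dr[2] += 0.5 * (1 + R**2 / r2) * M * G
    return dH_dp + [-d for d in dH_dr]


def rk4_step(state, dt):
    def shifted(k, h):
        return [s + h * ki for s, ki in zip(state, k)]

    k1 = flow(state)
    k2 = flow(shifted(k1, dt / 2))
    k3 = flow(shifted(k2, dt / 2))
    k4 = flow(shifted(k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def evolve(state, dt=1e-3, steps=2000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


# Start on the constraint surface: |r| = R and p tangent to the sphere.
theta = 0.5
start = [R * math.sin(theta), 0.0, -R * math.cos(theta), 0.0, 0.7, 0.0]
final = evolve(start)

radius_drift = abs(dot(final[:3], final[:3]) - R**2)
tangency_drift = abs(dot(final[:3], final[3:]))
energy_drift = abs(hamiltonian(final) - hamiltonian(start))
print(radius_drift, tangency_drift, energy_drift)
```

The initial point is chosen in the lower hemisphere, where small deviations from the constraint surface oscillate rather than grow. The flow is only guaranteed to be tangent to the surface, so a less careful integrator or a different starting point may show more drift.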

Before analyzing the Hamiltonian, consider the three constraints,

\varphi _{1}=p_{\lambda },\quad \varphi _{2}=r^{2}-R^{2},\quad \varphi _{3}={\vec {p}}\cdot {\vec {r}}.

Note the nontrivial Poisson bracket structure of the constraints. In particular,

\{\varphi _{2},\varphi _{3}\}=2r^{2}\neq 0.

The above Poisson bracket does not just fail to vanish off-shell, which might be anticipated, but even on-shell it is nonzero. Therefore, $\varphi_{2}$ and $\varphi_{3}$ are second class constraints while $\varphi_{1}$ is a first class constraint. Note that these constraints satisfy the regularity condition.
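The classification can be made concrete by tabulating all the brackets among the three constraints at a point of the constraint surface. In the Python sketch below (names are illustrative assumptions), the canonical bracket is evaluated from hand-computed gradients with respect to $(x,y,z,\lambda ,p_{x},p_{y},p_{z},p_{\lambda })$, and a constraint is labelled first class when its whole row of brackets vanishes there.

```python
def bracket(grad_f, grad_g):
    """Canonical Poisson bracket from gradients ordered (q_1..q_n, p_1..p_n)."""
    n = len(grad_f) // 2
    return sum(grad_f[i] * grad_g[n + i] - grad_f[n + i] * grad_g[i]
               for i in range(n))


def constraint_gradients(x, y, z, lam, px, py, pz, p_lam):
    """Gradients of phi_1 = p_lambda, phi_2 = r^2 - R^2, phi_3 = p . r
    with respect to (x, y, z, lambda, p_x, p_y, p_z, p_lambda)."""
    return [
        [0, 0, 0, 0, 0, 0, 0, 1],
        [2 * x, 2 * y, 2 * z, 0, 0, 0, 0, 0],
        [px, py, pz, 0, x, y, z, 0],
    ]


def classify(point, tol=1e-12):
    grads = constraint_gradients(*point)
    matrix = [[bracket(gi, gj) for gj in grads] for gi in grads]
    classes = ["first" if all(abs(entry) <= tol for entry in row) else "second"
               for row in matrix]
    return matrix, classes


# A point on the constraint surface for R = 1:
# r^2 = 1, p . r = 0, p_lambda = 0; the value of lambda is arbitrary.
point = (0.6, 0.0, -0.8, 2.5, 0.0, 0.5, 0.0, 0.0)
matrix, classes = classify(point)
print(classes)  # ['first', 'second', 'second']
```

The only nonzero entries of the matrix are $\{\varphi _{2},\varphi _{3}\}=2r^{2}$ and its antisymmetric partner, which is why exactly two of the three constraints come out second class.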

Here, we have a symplectic space where the Poisson bracket does not have "nice properties" on the constrained subspace. However, Dirac noticed that we can turn the underlying differential manifold of the symplectic space into a Poisson manifold using his eponymous modified bracket, called the Dirac bracket, such that this Dirac bracket of any (smooth) function with any of the second class constraints always vanishes.

Effectively, these brackets (illustrated for this spherical surface in the Dirac bracket article) project the system back onto the constraint surface. If one then wished to canonically quantize this system, one would need to promote the canonical Dirac brackets,^[4] not the canonical Poisson brackets, to commutation relations.

Examination of the above Hamiltonian shows a number of interesting things happening. One thing to note is that, on-shell when the constraints are satisfied, the extended Hamiltonian is identical to the naive Hamiltonian, as required. Also, note that $λ$ dropped out of the extended Hamiltonian. Since $\varphi_{1}$ is a first class primary constraint, it should be interpreted as a generator of a gauge transformation. The gauge freedom is the freedom to choose $λ$, which has ceased to have any effect on the particle's dynamics. Therefore, that $λ$ dropped out of the Hamiltonian, that $u_{1}$ is undetermined, and that $\varphi_{1}=p_{\lambda}$ is first class, are all closely interrelated.

Note that it would be more natural not to start with a Lagrangian with a Lagrange multiplier, but instead to take $r^{2}-R^{2}$ as a primary constraint and proceed through the formalism. The result would be the elimination of the extraneous dynamical quantity $λ$. However, the example is more edifying in its current form.

Example: Proca action

Another example we will use is the Proca action. The fields are $A^{\mu }=({\vec {A}},\phi )$ and the action is

S=\int d^{d}xdt\left[{\frac {1}{2}}E^{2}-{\frac {1}{4}}B_{ij}B_{ij}-{\frac {m^{2}}{2}}A^{2}+{\frac {m^{2}}{2}}\phi ^{2}\right]

where

{\vec {E}}\equiv -\nabla \phi -{\dot {\vec {A}}}

and

B_{ij}\equiv {\frac {\partial A_{j}}{\partial x_{i}}}-{\frac {\partial A_{i}}{\partial x_{j}}}.

$({\vec {A}},-{\vec {E}})$ and $(\phi ,\pi )$ are canonical variables. The second class constraints are

\pi \approx 0

and

\nabla \cdot {\vec {E}}+m^{2}\phi \approx 0.