# Symmetry of second derivatives

In mathematics, the symmetry of second derivatives (also called the equality of mixed partials) refers to the possibility under certain conditions (see below) of interchanging the order of taking partial derivatives of a function

${\displaystyle f\left(x_{1},\,x_{2},\,\ldots ,\,x_{n}\right)}$

of n variables. If the partial derivative with respect to ${\displaystyle x_{i}}$ is denoted with a subscript ${\displaystyle i}$, then the symmetry is the assertion that the second-order partial derivatives ${\displaystyle f_{ij}}$ satisfy the identity

${\displaystyle f_{ij}=f_{ji}}$

so that they form an n × n symmetric matrix. This is sometimes known as Schwarz's theorem, Clairaut's theorem, or Young's theorem.[1][2]

In the context of partial differential equations it is called the Schwarz integrability condition.

## Hessian matrix

This matrix of second-order partial derivatives of f is called the Hessian matrix of f. The entries in it off the main diagonal are the mixed derivatives; that is, successive partial derivatives with respect to different variables.

In most "real-life" circumstances the Hessian matrix is symmetric, although there are many functions that do not have this property. Mathematical analysis reveals that symmetry requires a hypothesis on f that goes further than the mere existence of the second derivatives at a particular point. Schwarz's theorem gives a sufficient condition on f for this to occur.
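As a rough numerical illustration, the following sketch approximates the Hessian of a smooth function by nested central differences and measures how far it is from being symmetric. The helper names `partial` and `hessian`, the test function `g`, and the step sizes are all assumptions chosen for this example. The two step sizes differ on purpose, so that the formula is not symmetric in i and j by construction.

```python
import math

def partial(f, i, step):
    """Return a function approximating the partial derivative of f with
    respect to its i-th argument (0-based) by a central difference."""
    def df(*x):
        up = list(x)
        down = list(x)
        up[i] += step
        down[i] -= step
        return (f(*up) - f(*down)) / (2 * step)
    return df

def hessian(f, point, outer=1e-3, inner=1e-5):
    """Approximate H[i][j] = D_i(D_j f)(point): differentiate in x_j first
    (inner step), then in x_i (outer step)."""
    n = len(point)
    return [[partial(partial(f, j, inner), i, outer)(*point) for j in range(n)]
            for i in range(n)]

# Illustrative smooth function of three variables (an arbitrary choice).
def g(x, y, z):
    return math.sin(x * y) + x * z ** 2 + math.exp(y * z)

H = hessian(g, (0.3, -0.7, 1.1))

# Largest difference between an entry and its mirror image across the diagonal.
asymmetry = max(abs(H[i][j] - H[j][i]) for i in range(3) for j in range(3))
print(asymmetry)
```

For a function with continuous second partial derivatives, the printed value should be small, limited only by discretization and rounding error.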

## Formal expressions of symmetry

In symbols, the symmetry may be expressed as:

${\displaystyle {\frac {\partial }{\partial x}}\left({\frac {\partial f}{\partial y}}\right)\ =\ {\frac {\partial }{\partial y}}\left({\frac {\partial f}{\partial x}}\right)\qquad {\text{or}}\qquad {\frac {\partial ^{2}\!f}{\partial x\,\partial y}}\ =\ {\frac {\partial ^{2}\!f}{\partial y\,\partial x}}.}$

Another notation is:

${\displaystyle \partial _{xy}f=\partial _{yx}f.}$

In terms of composition of the differential operator ${\displaystyle D_{i}}$, which takes the partial derivative with respect to ${\displaystyle x_{i}}$:

${\displaystyle D_{i}\circ D_{j}=D_{j}\circ D_{i}}$.

From this relation it follows that the ring of differential operators with constant coefficients, generated by the ${\displaystyle D_{i}}$, is commutative; but this is only true as operators over a domain of sufficiently differentiable functions. It is easy to check the symmetry as applied to monomials, so that one can take polynomials in the ${\displaystyle x_{i}}$ as a domain. In fact smooth functions are another valid domain.
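The check on monomials can be carried out exactly. The sketch below represents a polynomial as a dictionary from exponent tuples to integer coefficients; this representation, the helper `diff`, and the sample polynomial `p` are assumptions made for the illustration. Indices are 0-based, so index 0 stands for the first variable.

```python
def diff(poly, i):
    """Differentiate a polynomial with respect to variable i (0-based).

    poly maps exponent tuples, e.g. (2, 1, 0) for x1^2 * x2, to coefficients.
    """
    result = {}
    for exps, coeff in poly.items():
        if exps[i] == 0:
            continue  # this term does not depend on variable i
        new_exps = exps[:i] + (exps[i] - 1,) + exps[i + 1:]
        result[new_exps] = result.get(new_exps, 0) + coeff * exps[i]
    return result

# p(x1, x2, x3) = 3*x1^2*x2 + 5*x1*x2^3*x3 - 7*x3^4  (an arbitrary example)
p = {(2, 1, 0): 3, (1, 3, 1): 5, (0, 0, 4): -7}

d12 = diff(diff(p, 1), 0)  # differentiate in x2 first, then in x1
d21 = diff(diff(p, 0), 1)  # differentiate in x1 first, then in x2

print(d12 == d21)
```

Each monomial contributes the same term in either order, because differentiating lowers the two exponents independently and multiplies by the same pair of factors.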

## Schwarz's theorem

In mathematical analysis, Schwarz's theorem (or Clairaut's theorem on equality of mixed partials)[3] named after Alexis Clairaut and Hermann Schwarz, states that if ${\displaystyle \left(a_{1},\,\ldots ,\,a_{n}\right)\in \mathbb {R} ^{n}}$, ${\displaystyle \Omega \subseteq \mathbb {R} ^{n}}$, some neighborhood of ${\displaystyle \left(a_{1},\,\ldots ,\,a_{n}\right)}$ is contained in ${\displaystyle \Omega }$,

${\displaystyle f\colon \Omega \to \mathbb {R} }$

and ${\displaystyle f}$ has continuous second partial derivatives at the point ${\displaystyle \left(a_{1},\,\ldots ,\,a_{n}\right)}$, then ${\displaystyle \forall i,j\in \{1,\,2,\,\ldots ,\,n\},}$

${\displaystyle {\frac {\partial ^{2}}{\partial x_{i}\,\partial x_{j}}}f\left(a_{1},\,\ldots ,\,a_{n}\right)={\frac {\partial ^{2}}{\partial x_{j}\,\partial x_{i}}}f\left(a_{1},\,\ldots ,\,a_{n}\right).}$

The partial derivatives of this function are commutative at that point. One easy way to establish this theorem (in the case where ${\displaystyle n=2}$, ${\displaystyle i=1}$, and ${\displaystyle j=2}$, which readily entails the result in general) is by applying Green's theorem to the gradient of ${\displaystyle f.}$ An elementary proof is as follows. Let

${\displaystyle {\begin{aligned}u\left(h_{1},\,h_{2}\right)&=f\left(a_{1}+h_{1},\,a_{2}+h_{2}\right)-f\left(a_{1}+h_{1},\,a_{2}\right),\\v\left(h_{1},\,h_{2}\right)&=f\left(a_{1}+h_{1},\,a_{2}+h_{2}\right)-f\left(a_{1},\,a_{2}+h_{2}\right),\\w\left(h_{1},\,h_{2}\right)&=f\left(a_{1}+h_{1},\,a_{2}+h_{2}\right)-f\left(a_{1}+h_{1},\,a_{2}\right)-f\left(a_{1},\,a_{2}+h_{2}\right)+f\left(a_{1},\,a_{2}\right).\end{aligned}}}$

These functions are defined for ${\displaystyle \left|h_{1}\right|,\,\left|h_{2}\right|<\varepsilon }$, where ${\displaystyle \varepsilon >0}$ is chosen so that ${\displaystyle \left[a_{1}-\varepsilon ,\,a_{1}+\varepsilon \right]\times \left[a_{2}-\varepsilon ,\,a_{2}+\varepsilon \right]\subseteq \Omega }$.

By the mean value theorem, for ${\displaystyle h_{1},\,h_{2}\neq 0}$ we have

${\displaystyle {\begin{aligned}w\left(h_{1},\,h_{2}\right)&=u\left(h_{1},\,h_{2}\right)-u\left(0,\,h_{2}\right)=h_{1}{\frac {\partial }{\partial x}}u\left(\theta _{1}\left(h_{1},\,h_{2}\right)h_{1},\,h_{2}\right)\\&=h_{1}\left({\frac {\partial }{\partial x}}f\left(a_{1}+\theta _{1}\left(h_{1},\,h_{2}\right)h_{1},\,a_{2}+h_{2}\right)-{\frac {\partial }{\partial x}}f\left(a_{1}+\theta _{1}\left(h_{1},\,h_{2}\right)h_{1},\,a_{2}\right)\right)\\&=h_{1}h_{2}{\frac {\partial ^{2}}{\partial x\partial y}}f\left(a_{1}+\theta _{1}\left(h_{1},\,h_{2}\right)h_{1},\,a_{2}+\theta _{2}\left(h_{1},\,h_{2}\right)h_{2}\right)\\w\left(h_{1},\,h_{2}\right)&=v\left(h_{1},\,h_{2}\right)-v\left(h_{1},\,0\right)=h_{2}{\frac {\partial }{\partial y}}v\left(h_{1},\,\theta _{3}\left(h_{1},\,h_{2}\right)h_{2}\right)\\&=h_{2}\left({\frac {\partial }{\partial y}}f\left(a_{1}+h_{1},\,a_{2}+\theta _{3}\left(h_{1},\,h_{2}\right)h_{2}\right)-{\frac {\partial }{\partial y}}f\left(a_{1},\,a_{2}+\theta _{3}\left(h_{1},\,h_{2}\right)h_{2}\right)\right)\\&=h_{1}h_{2}{\frac {\partial ^{2}f}{\partial y\partial x}}\left(a_{1}+\theta _{4}\left(h_{1},\,h_{2}\right)h_{1},\,a_{2}+\theta _{3}\left(h_{1},\,h_{2}\right)h_{2}\right)\end{aligned}}}$

for some functions ${\displaystyle \theta _{i}\left(h_{1},\,h_{2}\right)}$ such that ${\displaystyle 0<\theta _{i}\left(h_{1},\,h_{2}\right)<1}$. In the special case ${\displaystyle h_{1}=h_{2}=h\neq 0}$ this gives

${\displaystyle {\begin{aligned}h^{2}{\frac {\partial ^{2}}{\partial x\partial y}}f\left(a_{1}+\theta _{1}\left(h,\,h\right)h,\,a_{2}+\theta _{2}\left(h,\,h\right)h\right)&=h^{2}{\frac {\partial ^{2}}{\partial y\partial x}}f\left(a_{1}+\theta _{4}\left(h,\,h\right)h,\,a_{2}+\theta _{3}\left(h,\,h\right)h\right),\\{\frac {\partial ^{2}}{\partial x\partial y}}f\left(a_{1}+\theta _{1}\left(h,\,h\right)h,\,a_{2}+\theta _{2}\left(h,\,h\right)h\right)&={\frac {\partial ^{2}}{\partial y\partial x}}f\left(a_{1}+\theta _{4}\left(h,\,h\right)h,\,a_{2}+\theta _{3}\left(h,\,h\right)h\right).\end{aligned}}}$

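Letting ${\displaystyle h}$ tend to zero in the last equality, and using the continuity of the second partial derivatives at ${\displaystyle \left(a_{1},\,a_{2}\right)}$, we conclude that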

${\displaystyle {\frac {\partial ^{2}}{\partial x\partial y}}f\left(a_{1},\,a_{2}\right)={\frac {\partial ^{2}}{\partial y\partial x}}f\left(a_{1},\,a_{2}\right).}$
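The quantity at the heart of the proof, the second difference w(h, h) divided by h², can also be observed numerically. The sketch below evaluates it for a smooth sample function whose mixed partial derivative is known in closed form; the function `f`, the base point, and the helper name `second_difference_quotient` are assumptions chosen for the illustration.

```python
import math

def second_difference_quotient(f, a1, a2, h):
    """Return w(h, h) / h^2, with w the second difference from the proof."""
    w = f(a1 + h, a2 + h) - f(a1 + h, a2) - f(a1, a2 + h) + f(a1, a2)
    return w / (h * h)

def f(x, y):
    # Smooth sample function; both mixed partials equal exp(x) * cos(y).
    return math.exp(x) * math.sin(y)

a1, a2 = 0.5, 1.0
exact = math.exp(a1) * math.cos(a2)

# The quotient approaches the common value of the mixed partials as h shrinks.
for h in (1e-1, 1e-2, 1e-3):
    approx = second_difference_quotient(f, a1, a2, h)
    print(h, approx, abs(approx - exact))
```

Because w is symmetric in the roles of the two variables, the same quotient approximates both mixed partial derivatives, which is the observation the proof makes rigorous.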

## Sufficiency of twice-differentiability

A condition weaker than continuity of the second partial derivatives, and implied by it, that still suffices to ensure symmetry is that all first-order partial derivatives are themselves differentiable.[4] Another strengthening of the theorem, in which the existence of the permuted mixed partial is asserted rather than assumed, was provided by Peano in a short 1890 note in Mathesis:

If ${\displaystyle f:E\to \mathbb {R} }$ is defined on an open set ${\displaystyle E\subset \mathbb {R} ^{2}}$; ${\displaystyle \partial _{1}f(x,\,y)}$ and ${\displaystyle \partial _{2,1}f(x,\,y)}$ exist everywhere on ${\displaystyle E}$; ${\displaystyle \partial _{2,1}f}$ is continuous at ${\displaystyle \left(x_{0},\,y_{0}\right)\in E}$, and if ${\displaystyle \partial _{2}f(x,\,y_{0})}$ exists in a neighborhood of ${\displaystyle x=x_{0}}$, then ${\displaystyle \partial _{1,2}f}$ exists at ${\displaystyle \left(x_{0},\,y_{0}\right)}$ and ${\displaystyle \partial _{1,2}f\left(x_{0},\,y_{0}\right)=\partial _{2,1}f\left(x_{0},\,y_{0}\right)}$.[5]

## History

The equality of the mixed partial derivatives under certain conditions has a long history. Nicolaus I Bernoulli implicitly assumed the result as early as 1721, but Euler was the first to provide a proof. Other proofs followed, by Clairaut (1740), Lagrange (1797), Cauchy (1823) and many others in the 19th century. None of these proofs was without fault, however (for example, Clairaut assumed all definite integrals could be differentiated under the integral sign). In 1867 Ernst Leonard Lindelöf published a paper[6] criticizing in detail all the proofs he was familiar with. Finally, six years later, Hermann Schwarz (1873) gave the first satisfactory proof. This was followed by successive refinements that relaxed the hypotheses in Schwarz's theorem in various ways, among others by Dini, Jordan, Peano, E. W. Hobson, and W. H. Young. For a good historical account, see [7].

## Distribution theory formulation

The theory of distributions (generalized functions) eliminates analytic problems with the symmetry. The derivative of an integrable function can always be defined as a distribution, and symmetry of mixed partial derivatives always holds as an equality of distributions. The use of formal integration by parts to define differentiation of distributions puts the symmetry question back onto the test functions, which are smooth and certainly satisfy this symmetry. In more detail (where f is a distribution, written as an operator on test functions, and φ is a test function),

${\displaystyle \left(D_{1}D_{2}f\right)[\phi ]=-\left(D_{2}f\right)\left[D_{1}\phi \right]=f\left[D_{2}D_{1}\phi \right]=f\left[D_{1}D_{2}\phi \right]=-\left(D_{1}f\right)\left[D_{2}\phi \right]=\left(D_{2}D_{1}f\right)[\phi ].}$

Another approach, which defines the Fourier transform of a function, is to note that on such transforms partial derivatives become multiplication operators that commute much more obviously.
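As a brief sketch of this remark, assume the normalization ${\displaystyle {\hat {f}}(\xi )=\int f(x)\,e^{-ix\cdot \xi }\,dx}$ (one common convention; the argument does not depend on the choice) and a function or tempered distribution f for which the transform is defined. Then each partial derivative becomes multiplication by ${\displaystyle i\xi _{j}}$, and the symmetry reduces to commutativity of multiplication:

${\displaystyle {\widehat {D_{1}D_{2}f}}(\xi )=(i\xi _{1})(i\xi _{2})\,{\hat {f}}(\xi )=(i\xi _{2})(i\xi _{1})\,{\hat {f}}(\xi )={\widehat {D_{2}D_{1}f}}(\xi ).}$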

## Requirement of continuity

The symmetry may be broken if the function fails to have differentiable partial derivatives, which is possible only if the hypothesis of Clairaut's theorem is not satisfied (that is, the second partial derivatives are not continuous).

The function f(x, y), as shown in equation (1), does not have symmetric second derivatives at the origin.

An example of non-symmetry is the function

${\displaystyle f(x,\,y)={\begin{cases}{\frac {xy\left(x^{2}-y^{2}\right)}{x^{2}+y^{2}}}&{\mbox{ for }}(x,\,y)\neq (0,\,0)\\0&{\mbox{ for }}(x,\,y)=(0,\,0).\end{cases}}}$

(1)

This can be visualized by the polar form ${\displaystyle f(r\cos(\theta ),r\sin(\theta ))={\frac {r^{2}\sin(4\theta )}{4}}}$; it is everywhere continuous, but its derivatives at (0, 0) cannot be computed algebraically. Rather, the limit of difference quotients shows that ${\displaystyle \left.\partial _{x}f\right|_{(0,0)}=\left.\partial _{y}f\right|_{(0,0)}=0}$, so the graph z = f(x, y) has a horizontal tangent plane at (0, 0), and the partial derivatives ${\displaystyle \partial _{x}f,\partial _{y}f}$ exist and are everywhere continuous. However, the second partial derivatives are not continuous at (0, 0), and the symmetry fails. In fact, along the x-axis the y-derivative is ${\displaystyle \left.\partial _{y}f\right|_{(x,0)}=x}$, and so:

${\displaystyle \left.\partial _{x}\partial _{y}f\right|_{(0,0)}=\lim _{\varepsilon \rightarrow 0}{\frac {\left.\partial _{y}f\right|_{(\varepsilon ,0)}-\left.\partial _{y}f\right|_{(0,0)}}{\varepsilon }}=1.}$

In contrast, along the y-axis the x-derivative is ${\displaystyle \left.\partial _{x}f\right|_{(0,y)}=-y}$, and so ${\displaystyle \left.\partial _{y}\partial _{x}f\right|_{(0,0)}=-1}$. That is, ${\displaystyle \partial _{xy}f\neq \partial _{yx}f}$ at (0, 0), although the mixed partial derivatives do exist, and at every other point the symmetry does hold.
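The two values can be reproduced with nested difference quotients. The sketch below evaluates the function of equation (1) in exact rational arithmetic, using an inner step much smaller than the outer step so that the inner quotient stands in for the first derivative. The helper names and the particular step sizes are assumptions made for this illustration.

```python
from fractions import Fraction

def f(x, y):
    """The function of equation (1), evaluated exactly on rationals."""
    if x == 0 and y == 0:
        return Fraction(0)
    return x * y * (x * x - y * y) / (x * x + y * y)

def d_dy(x, y, k):
    """Forward difference quotient approximating the y-derivative at (x, y)."""
    return (f(x, y + k) - f(x, y)) / k

def d_dx(x, y, k):
    """Forward difference quotient approximating the x-derivative at (x, y)."""
    return (f(x + k, y) - f(x, y)) / k

zero = Fraction(0)
h = Fraction(1, 10**3)  # outer step
k = Fraction(1, 10**9)  # inner step, much smaller than h

# x-derivative of the y-derivative at the origin (close to +1)
dx_dy = (d_dy(h, zero, k) - d_dy(zero, zero, k)) / h
# y-derivative of the x-derivative at the origin (close to -1)
dy_dx = (d_dx(zero, h, k) - d_dx(zero, zero, k)) / h

print(float(dx_dy), float(dy_dx))
```

Working with `Fraction` avoids rounding error, so the gap between the two results reflects the function itself rather than floating-point noise.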

The above function, written in a cylindrical coordinate system, can be expressed as

${\displaystyle f(r,\,\theta )={\frac {r^{2}\sin {4\theta }}{4}},}$

showing that the function oscillates four times when traveling once around an arbitrarily small loop containing the origin. Intuitively, therefore, the local behavior of the function at ${\displaystyle (0,\,0)}$ cannot be described as a quadratic form, and the Hessian matrix thus fails to be symmetric.

In general, limiting operations need not commute. Given two variables near (0, 0), there are two limiting processes on

${\displaystyle f(h,\,k)-f(h,\,0)-f(0,\,k)+f(0,\,0)}$

corresponding to making h → 0 first, and to making k → 0 first. It can matter, looking at the first-order terms, which is applied first. This leads to the construction of pathological examples in which second derivatives are non-symmetric. This kind of example belongs to the theory of real analysis, where the pointwise value of functions matters. When viewed as a distribution, the second partial derivative's values can be changed on an arbitrary set of points as long as this set has Lebesgue measure 0. Since in the example the Hessian is symmetric everywhere except (0, 0), there is no contradiction with the fact that the Hessian, viewed as a Schwartz distribution, is symmetric.
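For the function of equation (1) the two iterated limits can be written out explicitly. Since f vanishes on both coordinate axes, the second difference above reduces to f(h, k), and dividing by hk gives

${\displaystyle {\frac {f(h,\,k)-f(h,\,0)-f(0,\,k)+f(0,\,0)}{hk}}={\frac {h^{2}-k^{2}}{h^{2}+k^{2}}},\qquad \lim _{h\to 0}\lim _{k\to 0}{\frac {h^{2}-k^{2}}{h^{2}+k^{2}}}=1,\qquad \lim _{k\to 0}\lim _{h\to 0}{\frac {h^{2}-k^{2}}{h^{2}+k^{2}}}=-1.}$

The two iterated limits recover the values 1 and −1 of the two mixed partial derivatives at the origin.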

## In Lie theory

Consider the first-order differential operators ${\displaystyle D_{i}}$ to be infinitesimal operators on Euclidean space. That is, ${\displaystyle D_{i}}$ in a sense generates the one-parameter group of translations parallel to the ${\displaystyle x_{i}}$-axis. These groups commute with each other, and therefore the infinitesimal generators do also; the Lie bracket

${\displaystyle \left[D_{i},\,D_{j}\right]=0}$

is this property's reflection. In other words, the Lie derivative of one coordinate with respect to another is zero.

## Application to differential forms

The Clairaut-Schwarz theorem is the key fact needed to prove that for every ${\displaystyle C^{\infty }}$ (or at least twice differentiable) differential form ${\displaystyle \omega \in \Omega ^{k}(M)}$, the second exterior derivative vanishes: ${\displaystyle d^{2}\omega :=d(d\omega )=0}$. This implies that every differentiable exact form (i.e., a form ${\displaystyle \alpha }$ such that ${\displaystyle \alpha =d\omega }$ for some form ${\displaystyle \omega }$) is closed (i.e., ${\displaystyle d\alpha =0}$), since ${\displaystyle d\alpha =d(d\omega )=0}$.[8]
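The simplest instance makes the dependence on the symmetry explicit. For a twice continuously differentiable function f on the plane, regarded as a 0-form (the two-variable setting and the coordinate names x, y are chosen here only for illustration),

${\displaystyle df={\frac {\partial f}{\partial x}}\,dx+{\frac {\partial f}{\partial y}}\,dy,\qquad d(df)=\left({\frac {\partial }{\partial x}}{\frac {\partial f}{\partial y}}-{\frac {\partial }{\partial y}}{\frac {\partial f}{\partial x}}\right)dx\wedge dy=0,}$

where the final equality is exactly the symmetry of the mixed partial derivatives.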

## References

1. "Young's Theorem" (PDF). Archived from the original (PDF) on May 18, 2006. Retrieved 2015-01-02.
2. Allen, R. G. D. (1964). Mathematical Analysis for Economists. New York: St. Martin's Press. pp. 300–305.
6. Lindelöf, Ernst Leonard (1867). "Remarques sur les différentes manières d'établir la formule ${\displaystyle {\frac {d^{2}z}{dxdy}}={\frac {d^{2}z}{dydx}}}$". Acta Societatis Scientiarum Fennicae. 8, part 1: 205–213.