Second partial derivative test


In mathematics, the second partial derivative test is a method in multivariable calculus used to determine whether a critical point of a function is a local minimum, a local maximum, or a saddle point.

The test

Functions of two variables

Suppose that f(x, y) is a differentiable real function of two variables whose second partial derivatives exist and are continuous. The Hessian matrix H of f is the 2 × 2 matrix of its second partial derivatives:

H(x,y) = \begin{pmatrix}f_{xx}(x,y) &f_{xy}(x,y)\\f_{yx}(x,y) &f_{yy}(x,y)\end{pmatrix}.

Define D(x, y) to be the determinant of H:

D(x,y)=\det(H(x,y)) = f_{xx}(x,y)f_{yy}(x,y) - \left( f_{xy}(x,y) \right)^2 .

Finally, suppose that (a, b) is a critical point of f (that is, f_x(a, b) = f_y(a, b) = 0). Then the second partial derivative test asserts the following:[1]

  1. If D(a,b)>0 and f_{xx}(a,b)>0 then (a,b) is a local minimum of f.
  2. If D(a,b)>0 and f_{xx}(a,b)<0 then (a,b) is a local maximum of f.
  3. If D(a,b)<0 then (a,b) is a saddle point of f.
  4. If D(a,b)=0 then the second derivative test is inconclusive, and the point (a, b) could be any of a minimum, maximum or saddle point.

Note that other equivalent versions of the test are possible. For example, some texts may use the trace f_{xx} + f_{yy} in place of the value f_{xx} in the first two cases above.[citation needed] Such variations do not alter the outcome of the test.
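
To make the procedure concrete, here is a minimal sketch of the test in Python using SymPy (the library choice and the helper name classify_critical_point are ours, not part of the standard statement):

import sympy as sp

x, y = sp.symbols('x y')

def classify_critical_point(f, a, b):
    # Second partial derivatives of f.
    fxx = sp.diff(f, x, 2)
    fyy = sp.diff(f, y, 2)
    fxy = sp.diff(f, x, y)
    # Determinant of the Hessian, evaluated at (a, b).
    D = (fxx * fyy - fxy**2).subs({x: a, y: b})
    if D > 0:
        # The sign of f_xx decides between minimum and maximum.
        return 'local minimum' if fxx.subs({x: a, y: b}) > 0 else 'local maximum'
    if D < 0:
        return 'saddle point'
    return 'inconclusive'  # D == 0: the test gives no information

# f(x, y) = x**2 + y**2 has D = 4 > 0 and f_xx = 2 > 0 at (0, 0).
print(classify_critical_point(x**2 + y**2, 0, 0))  # -> local minimum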

Functions of many variables

For a function f of more than two variables, there is a generalization of the rule above. In this context, instead of examining the determinant of the Hessian matrix, one must look at the eigenvalues of the Hessian matrix at the critical point. The following test can be applied at any critical point (a, b, ...) for which the Hessian matrix is invertible:

  1. If the Hessian is positive definite (equivalently, has all eigenvalues positive) at (a, b, ...), then f attains a local minimum at (a, b, ...).
  2. If the Hessian is negative definite (equivalently, has all eigenvalues negative) at (a, b, ...), then f attains a local maximum at (a, b, ...).
  3. If the Hessian has both positive and negative eigenvalues then (a, b, ...) is a saddle point for f (and in fact this is true even if (a, b, ...) is degenerate).

In those cases not listed above, the test is inconclusive.[2]
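
In computational practice the eigenvalue form of the test is easy to apply; the following sketch (assuming NumPy, with the helper name classify_by_eigenvalues chosen for illustration) classifies a critical point from a Hessian that has already been evaluated there:

import numpy as np

def classify_by_eigenvalues(hessian):
    # Eigenvalues of a symmetric matrix, returned in ascending order.
    eigs = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    if np.all(eigs > 0):
        return 'local minimum'   # Hessian positive definite
    if np.all(eigs < 0):
        return 'local maximum'   # Hessian negative definite
    if np.any(eigs > 0) and np.any(eigs < 0):
        return 'saddle point'    # mixed signs, even if some eigenvalue is 0
    return 'inconclusive'        # semidefinite with a zero eigenvalue

# Hessian of f(x, y, z) = x**2 + y**2 - z**2 at the origin: a saddle point.
print(classify_by_eigenvalues([[2, 0, 0], [0, 2, 0], [0, 0, -2]]))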

Note that for functions of three or more variables, the determinant of the Hessian alone does not provide enough information to classify the critical point, because the number of jointly sufficient second-order conditions equals the number of variables, and the sign condition on the determinant of the Hessian is only one of them. Note also that this statement of the second derivative test for many variables applies in the two-variable and one-variable cases as well; in the latter case, it reduces to the usual second derivative test.

In the two-variable case, f_{xx}(a, b) and D(a, b) are the leading principal minors of the Hessian. The first two conditions listed above on the signs of these minors are precisely the conditions for the positive or negative definiteness of the Hessian. For the general case of an arbitrary number n of variables, there are n sign conditions on the n leading principal minors of the Hessian matrix that together are equivalent to positive or negative definiteness of the Hessian (Sylvester's criterion): for a local minimum, all the leading principal minors need to be positive, while for a local maximum, the minors of odd order need to be negative and the minors of even order positive. See Hessian matrix#Bordered Hessian for a discussion that generalizes these rules to the case of equality-constrained optimization.
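
As a sketch of how these sign conditions can be checked mechanically (assuming NumPy; the helper name classify_by_leading_minors is our own), one can compute the n leading principal minors directly:

import numpy as np

def classify_by_leading_minors(hessian):
    H = np.asarray(hessian, dtype=float)
    n = H.shape[0]
    # minors[k-1] is the determinant of the upper-left k x k block.
    minors = [np.linalg.det(H[:k, :k]) for k in range(1, n + 1)]
    if all(m > 0 for m in minors):
        return 'local minimum'   # Hessian positive definite
    if all((m < 0 if k % 2 == 1 else m > 0)
           for k, m in enumerate(minors, start=1)):
        return 'local maximum'   # odd-order minors negative, even-order positive
    return 'indefinite or inconclusive'

# Two-variable case: the minors are f_xx and D, recovering rules 1 and 2.
print(classify_by_leading_minors([[2, 0], [0, 3]]))  # -> local minimum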

Geometric interpretation in the two-variable case

Assume that all derivatives of f below are evaluated at (a, b), and that the first derivatives vanish there.

If D<0 then f_{xx}f_{yy} < f_{xy}^2. If f_{xx} and f_{yy} have different signs, then one must be positive and the other negative, so the concavities of the x cross section (the xz trace) and the y cross section (the yz trace) point in opposite directions. This is clearly a saddle point.

If D>0 then f_{xx}f_{yy} > f_{xy}^2, which implies that f_{xx} and f_{yy} have the same sign and are sufficiently large in magnitude. In this case the concavities of the x and y cross sections are either both up (if the sign is positive) or both down (if negative). This is clearly a local minimum or a local maximum, respectively.

This leaves the last case: D < 0 (so f_{xx}f_{yy} < f_{xy}^2) with f_{xx} and f_{yy} of the same sign. Geometrically, a large f_{xy} means that the slope of the graph in one direction changes rapidly as we move in the orthogonal direction, fast enough to overcome the concavity in that direction. For example, take the case in which all second derivatives are positive and (a, b) = (0, 0). If D > 0, then moving from the origin in any direction in the xy plane increases the value of the function, and the origin is a local minimum. If instead D < 0 (f_{xy} sufficiently large), then moving in some direction between the axes, say into the second quadrant of the xy plane, the positive concavity would lead us to expect the value of the function to increase; but the slope in the x direction grows even faster, so that as we go left (in the negative x direction) into the second quadrant, the value of the function ends up decreasing. Since the origin is a stationary point by hypothesis, it is therefore a saddle point.
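
A small numeric illustration of this situation (our own example, not from the text): f(x, y) = x^2 + 4xy + y^2 has f_{xx} = f_{yy} = 2 > 0 at the origin, but f_{xy} = 4, so D = 2 · 2 - 4^2 = -12 < 0:

# f has positive concavity along both axes, yet the origin is a saddle.
f = lambda x, y: x**2 + 4*x*y + y**2

print(f(0.1, 0.1))    # about  0.06: f increases along the line y = x
print(f(0.1, -0.1))   # about -0.02: f decreases along the line y = -x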

Examples

[Figure: critical points of f(x, y) = (x+y)(xy + xy^2); maxima (red) and saddle points (blue).]

To find and classify the critical points of the function

 z = f(x, y) = (x+y)(xy + xy^2) ,

we first set the partial derivatives

\frac{\partial z}{\partial x} = y(2x + y)(y + 1) and \frac{\partial z}{\partial y} = x \left( 3y^2 + 2y(x+1) + x \right)

equal to zero and solve the resulting equations simultaneously to find the four critical points

(0,0), (0, -1), (1,-1) and \left(\frac{3}{8}, -\frac{3}{4}\right).

In order to classify the critical points, we examine the value of the determinant D(x, y) of the Hessian of f at each of the four critical points. We have


\begin{align}
 D(a, b) &= f_{xx}(a,b)f_{yy}(a,b) - \left( f_{xy}(a,b) \right)^2 \\ 
         &= 2b(b+1) \cdot 2a(a + 3b + 1) - (2a + 2b + 4ab + 3b^2)^2.
\end{align}

Evaluating D at each of the four critical points, we have

D(0, 0) = 0; ~~ D(0, -1) = -1; ~~ D(1, -1) = -1; ~~ D\left(\frac{3}{8}, -\frac{3}{4}\right) = \frac{27}{128}.

Thus, the second partial derivative test indicates that f(x, y) has saddle points at (0, −1) and (1, −1) and has a local maximum at \left(\frac{3}{8}, -\frac{3}{4}\right), since f_{xx} = -\frac{3}{8} < 0. At the remaining critical point (0, 0) the second derivative test is inconclusive, and one must use higher-order tests or other tools to determine the behavior of the function at this point. (In fact, one can show that f takes both positive and negative values in arbitrarily small neighborhoods of (0, 0), so this point is a saddle point of f.)
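
The computation above can be checked symbolically; here is a minimal sketch assuming SymPy (it should reproduce the four critical points and the values of D, though not necessarily in the order listed above):

import sympy as sp

x, y = sp.symbols('x y')
f = (x + y) * (x*y + x*y**2)

# Solve f_x = f_y = 0 for the critical points.
fx, fy = sp.diff(f, x), sp.diff(f, y)
critical_points = sp.solve([fx, fy], [x, y], dict=True)
print(critical_points)  # four solutions: (0,0), (0,-1), (1,-1), (3/8,-3/4)

# Determinant of the Hessian and f_xx, evaluated at each critical point.
D = sp.diff(f, x, 2) * sp.diff(f, y, 2) - sp.diff(f, x, y)**2
for p in critical_points:
    print(p, sp.simplify(D.subs(p)), sp.diff(f, x, 2).subs(p))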

Notes

  1. Stewart 2004, p. 803.
  2. Kurt Endl and Wolfgang Luh, Analysis II, Aula-Verlag, 1972; 7th ed. 1989, ISBN 3-89104-455-0, pp. 248–258 (in German).
