Halley's method


In numerical analysis, Halley’s method is a root-finding algorithm used for functions of one real variable with a continuous second derivative, i.e., C2 functions. It is named after its inventor Edmond Halley.

The algorithm is second in the class of Householder's methods, after Newton's method. Like the latter, it iteratively produces a sequence of approximations to the root, and its rate of convergence is cubic. Multidimensional versions of this method exist.

Halley's method can be viewed as exactly finding the roots of a linear-over-linear Padé approximation to the function, in contrast to Newton's method and the secant method, which approximate or interpolate the function linearly, or Cauchy's method and Muller's method, which approximate or interpolate it quadratically.[1]

Method

Halley’s method is a numerical algorithm for solving the nonlinear equation f(x) = 0 for a function f of one real variable. The method consists of a sequence of iterations:

x_{n+1} = x_n - \frac {2 f(x_n) f'(x_n)} {2 {[f'(x_n)]}^2 - f(x_n) f''(x_n)}

beginning with an initial guess x0.
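For illustration, the iteration can be sketched in a few lines of Python; the test function cos(x) − x, the tolerance, and the iteration cap are arbitrary choices for this sketch, not part of the method itself.

```python
from math import cos, sin

def halley(f, df, d2f, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - 2 f f' / (2 f'^2 - f f'') from x0."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        step = 2.0 * fx * dfx / (2.0 * dfx * dfx - fx * d2f(x))
        x -= step
        if abs(step) < tol:
            break
    return x

# Solve cos(x) = x, i.e. f(x) = cos(x) - x (an arbitrary test problem).
root = halley(lambda x: cos(x) - x,
              lambda x: -sin(x) - 1.0,
              lambda x: -cos(x),
              x0=1.0)
print(root)  # ≈ 0.7390851332151607
```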

If f is a three times continuously differentiable function and a is a zero of f but not of its derivative, then, in a neighborhood of a, the iterates xn satisfy:

| x_{n+1} - a | \le K \cdot {| x_n - a |}^3,\text{ for some }K > 0.

This means that the iterates converge to the zero if the initial guess is sufficiently close, and that the convergence is cubic.

The following alternative formulation shows the similarity between Halley’s method and Newton’s method. Here the expression f(x_n)/f'(x_n) is computed only once, and the formulation is particularly useful when f''(x_n)/f'(x_n) can be simplified.

x_{n+1} = x_n -  \frac {f(x_n)} {f'(x_n)} \left[1 - \frac {f(x_n)}{f'(x_n)} \cdot \frac {f''(x_n)} {2 f'(x_n)} \right]^{-1}

A further alternative is the following, in which case the technique is sometimes referred to as Bailey’s method.[2]

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n) - \frac{f(x_n) f''(x_n)}{2 f'(x_n)}}

In any of these formulations, when the second derivative is very close to zero, the iteration is almost identical to Newton’s method.
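The Newton-like formulation above, and the speedup from the cubic convergence rate, can be illustrated numerically; the test function x³ − 2 and the starting point are arbitrary choices for this sketch.

```python
def newton_step(x, f, df):
    return x - f(x) / df(x)

def halley_step(x, f, df, d2f):
    t = f(x) / df(x)                      # the Newton correction
    return x - t / (1.0 - t * d2f(x) / (2.0 * df(x)))

# Arbitrary test problem: f(x) = x^3 - 2, with root 2**(1/3).
f   = lambda x: x**3 - 2.0
df  = lambda x: 3.0 * x**2
d2f = lambda x: 6.0 * x

root = 2.0 ** (1.0 / 3.0)
xn = xh = 1.0
for i in range(4):
    xn = newton_step(xn, f, df)
    xh = halley_step(xh, f, df, d2f)
    print(i, abs(xn - root), abs(xh - root))  # Halley's errors shrink much faster
```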

Derivation

Consider the function

g(x) = \frac {f(x)} {\sqrt{|f'(x)|}}.

Any root of f which is not a root of its derivative is a root of g; and any root of g is a root of f. Applying Newton's method to g gives

x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)}

with

g'(x) = \frac{2[f'(x)]^2 - f(x) f''(x)}{2 f'(x) \sqrt{|f'(x)|}},

and the result follows. Notice that if f′(c) = 0, then one cannot apply this at c because g(c) would be undefined.
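This equivalence is easy to check numerically: one Halley step on f matches one Newton step on g = f/√|f'|. In the sketch below, the test function exp(x) − 2 is an arbitrary choice, and a central difference stands in for g' rather than the closed form above.

```python
from math import exp, sqrt

# Arbitrary test function with known derivatives: f(x) = exp(x) - 2.
f   = lambda x: exp(x) - 2.0
df  = lambda x: exp(x)
d2f = lambda x: exp(x)

def halley_step(x):
    fx, dfx = f(x), df(x)
    return x - 2.0 * fx * dfx / (2.0 * dfx**2 - fx * d2f(x))

def newton_on_g_step(x, h=1e-6):
    g = lambda t: f(t) / sqrt(abs(df(t)))   # g = f / sqrt(|f'|)
    dg = (g(x + h) - g(x - h)) / (2.0 * h)  # central-difference g'
    return x - g(x) / dg

print(halley_step(1.0), newton_on_g_step(1.0))  # the two steps agree closely
```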

Cubic convergence

Suppose a is a root of f but not of its derivative, that the third derivative of f exists and is continuous in a neighborhood of a, and that xn lies in that neighborhood. Then Taylor's theorem implies:

0 = f(a) = f(x_n) + f'(x_n) (a - x_n) + \frac{f''(x_n)}{2} (a - x_n)^2 + \frac{f'''(\xi)}{6} (a - x_n)^3

and also

0 = f(a) = f(x_n) + f'(x_n) (a - x_n) + \frac{f''(\eta)}{2} (a - x_n)^2,

where ξ and η are numbers lying between a and xn. Multiply the first equation by 2f'(x_n) and subtract from it the second equation times f''(x_n)(a - x_n) to give:

\begin{align}
0 &= 2 f(x_n) f'(x_n) + 2 [f'(x_n)]^2 (a - x_n) + f'(x_n) f''(x_n) (a - x_n)^2 + \frac{f'(x_n) f'''(\xi)}{3} (a - x_n)^3 \\
&\qquad- f(x_n) f''(x_n) (a - x_n) - f'(x_n) f''(x_n) (a - x_n)^2 - \frac{f''(x_n) f''(\eta)}{2} (a - x_n)^3.
\end{align}

Canceling f'(x_n) f''(x_n) (a - x_n)^2 and re-organizing terms yields:

0 = 2 f(x_n) f'(x_n) + \left(2 [f'(x_n)]^2 - f(x_n) f''(x_n) \right) (a - x_n) + \left(\frac{f'(x_n) f'''(\xi)}{3} - \frac{f''(x_n) f''(\eta)}{2} \right) (a - x_n)^3.

Put the second term on the left side and divide through by

 2 [f'(x_n)]^2 - f(x_n) f''(x_n)

to get:

a - x_n = \frac{-2f(x_n) f'(x_n)}{2[f'(x_n)]^2 - f(x_n) f''(x_n)} - \frac{2f'(x_n) f'''(\xi) - 3 f''(x_n) f''(\eta)}{6(2 [f'(x_n)]^2 - f(x_n) f''(x_n))} (a - x_n)^3.

Thus:

a - x_{n+1} = - \frac{2 f'(x_n) f'''(\xi) - 3 f''(x_n) f''(\eta)}{12[f'(x_n)]^2 - 6 f(x_n) f''(x_n)} (a - x_n)^3.

The limit of the coefficient on the right side as xna is:

-\frac{2 f'(a) f'''(a) - 3 f''(a) f''(a)}{12 [f'(a)]^2}.

If we take K to be a little larger than the absolute value of this limit, we can take absolute values of both sides of the formula and replace the absolute value of the coefficient by its upper bound near a to get:

|a - x_{n+1}| \leq K |a - x_n|^3

which is what was to be proved.

To summarize,

\Delta x_{i+1} = \frac{3(f'')^2 - 2 f' f'''}{12 (f')^2} (\Delta x_i)^3 + O\left((\Delta x_i)^4\right), \qquad \Delta x_i \triangleq x_i - a.
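The error recursion can be verified numerically: iterate Halley's method, record the errors Δx_i, and compare the observed ratio Δx_{i+1}/(Δx_i)³ with the predicted coefficient evaluated at the root. The test function x³ − 2 and starting point are arbitrary choices for this sketch.

```python
f   = lambda x: x**3 - 2.0     # arbitrary test function
df  = lambda x: 3.0 * x**2
d2f = lambda x: 6.0 * x
d3f = lambda x: 6.0

a = 2.0 ** (1.0 / 3.0)         # the exact root
C = (3.0 * d2f(a)**2 - 2.0 * df(a) * d3f(a)) / (12.0 * df(a)**2)

x, errs = 2.0, []
for _ in range(3):             # three Halley steps from x0 = 2
    fx, dfx = f(x), df(x)
    x -= 2.0 * fx * dfx / (2.0 * dfx**2 - fx * d2f(x))
    errs.append(x - a)

ratio = errs[2] / errs[1]**3   # observed cubic-error coefficient
print(ratio, C)                # the two values agree closely (C ≈ 0.42)
```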

References

Notes

  1. ^ Boyd, J. P. (2013). "Finding the Zeros of a Univariate Equation: Proxy Rootfinders, Chebyshev Interpolation, and the Companion Matrix". SIAM Review 55 (2): 375–396. doi:10.1137/110838297.
  2. ^ See for example the Bond Exchange of South Africa’s Bond Pricing Formula Specifications, where Bailey’s method is suggested when solving for a bond’s Yield to maturity.
