Root-finding algorithm

A root-finding algorithm is a numerical method, or algorithm, for finding a value x such that f(x) = 0, for a given function f. Such an x is called a root of the function f.

This article is concerned with finding scalar, real or complex roots, approximated as floating point numbers. Finding integer roots or exact algebraic roots are separate problems, whose algorithms have little in common with those discussed here. (See: Diophantine equation as for integer roots)

Finding a root of f(x)g(x) = 0 is the same as solving the equation f(x) = g(x). Here, x is called the unknown in the equation. Conversely, any equation can take the canonical form f(x) = 0, so equation solving is the same thing as computing (or finding) a root of a function.

Numerical root-finding methods use iteration, producing a sequence of numbers that hopefully converge towards a limit (the so-called "fixed point") which is a root. The first values of this series are initial guesses. The method computes subsequent values based on the old ones and the function f.

The behaviour of root-finding algorithms is studied in numerical analysis. Algorithms perform best when they take advantage of known characteristics of the given function. Thus an algorithm to find isolated real roots of a low-degree polynomial in one variable may bear little resemblance to an algorithm for complex roots of a "black-box" function which is not even known to be differentiable. Questions include ability to separate close roots, robustness in achieving reliable answers despite inevitable numerical errors, and rate of convergence.

Specific algorithms

Bisection Method

The simplest root-finding algorithm is the bisection method. It works when f is a continuous function and it requires previous knowledge of two initial guesses, a and b, such that f(a) and f(b) have opposite signs. Although it is reliable, it converges slowly, gaining one bit of accuracy with each iteration.

Newton's Method (and similar derivative-based methods)

Newton's method assumes the function f to have a continuous derivative. Newton's method may not converge if started too far away from a root. However, when it does converge, it is faster than the bisection method, and is usually quadratic. Newton's method is also important because it readily generalizes to higher-dimensional problems. Newton-like methods with higher orders of convergence are the Householder's methods. The first one after Newton's method is Halley's method with cubic order of convergence.

Secant Method

Replacing the derivative in Newton's method with a finite difference, we get the secant method. This method does not require the computation (nor the existence) of a derivative, but the price is slower convergence (the order is approximately 1.6). A generalization of the secant method in higher dimensions is Broyden's method.

False Position (Regula Falsi)

The false position method, also called the regula falsi method, is like the secant method. However, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method is faster than the bisection method and more robust than the secant method, but requires the two starting points to bracket the root. Ridders' method is a variant on the false-position method that also evaluates the function at the midpoint of the interval, giving faster convergence with similar robustness.

Interpolation

The secant method also arises if one approximates the unknown function f by linear interpolation. When quadratic interpolation is used instead, one arrives at Muller's method. It converges faster than the secant method. A particular feature of this method is that the iterates xn may become complex.

Sidi's method allows for interpolation with an arbitrarily high degree polynomial. The higher the degree of the interpolating polynomial, the faster the convergence. Sidi's method allows for convergence with an order arbitrarily close to 2.

Inverse Interpolation

This can be avoided by interpolating the inverse of f, resulting in the inverse quadratic interpolation method. Again, convergence is asymptotically faster than the secant method, but inverse quadratic interpolation often behaves poorly when the iterates are not close to the root.

Combinations of Methods

Brent's Method

Brent's method is a combination of the bisection method, the secant method and inverse quadratic interpolation. At every iteration, Brent's method decides which method out of these three is likely to do best, and proceeds by doing a step according to that method. This gives a robust and fast method, which therefore enjoys considerable popularity.

Finding roots of polynomials

Much attention has been given to the special case that the function f is a polynomial; there exist root-finding algorithms exploiting the polynomial nature of f. For a univariate polynomial of degree less than five, there are closed form solutions such as the quadratic formula which produce all roots. However, even this degree-two solution should be used with care to ensure numerical stability. Even more care must be taken with the degree-three and degree-four solutions because of their complexity. Higher-degree polynomials have no such general solution, according to the Abel–Ruffini theorem (1824, 1799).

Birge-Vieta's method combines Horner's method of polynomial evaluation with Newton-Raphson to provide a computational speed-up.

For real roots, Sturm's theorem and Descartes' rule of signs provide guides to locating and separating roots. This plus interval arithmetic combined with Newton's method yields robust and fast algorithms.

The algorithm for isolating the roots, using Descartes' rule of signs and Vincent's theorem, had been originally called modified Uspensky's algorithm by its inventors Collins and Akritas.[1] After going through names like "Collins-Akritas method" and "Descartes' method" (too confusing if ones considers Fourier's article[2]), it was finally François Boulier, of Lille University, who gave it the name Vincent-Collins-Akritas (VCA) method,[3] p. 24, based on the fact that "Uspensky's method" does not exist[4] and neither does "Descartes' method".[5] This algorithm has been improved by Rouillier and Zimmerman,[6] and the resulting implementation is, to the date, the fastest bisection method. It has the same worst case complexity as Sturm algorithm, but is almost always much faster. It is the default algorithm of Maple root-finding function fsolve. Another method based on Vincent's theorem is the Vincent–Akritas–Strzeboński (VAS) method;[7] it has been shown[8] that the VAS (continued fractions) method is faster than the fastest implementation of the VCA (bisection) method,[6] a fact that was independently confirmed elsewhere;[9] more precisely, for the Mignotte polynomials of high degree VAS is about 50,000 times faster than the fastest implementation of VCA. VAS is the default algorithm for root isolation in Mathematica, Sage, SymPy, Xcas. See Budan's theorem for a description of the historical background of these methods. For a comparison between Sturm's method and VAS use the functions realroot(poly) and time(realroot(poly)) of Xcas. By default, to isolate the real roots of poly realroot uses the VAS method; to use Sturm's method write realroot(sturm, poly). See also the External links for a pointer to an iPhone/iPod/iPad application that does the same thing.

One possibility is to form the companion matrix of the polynomial. Since the eigenvalues of this matrix coincide with the roots of the polynomial, one can use any eigenvalue algorithm to find the roots of the polynomial. For instance the classical Bernoulli's method to find the root greater in modulus, if it exists, turns out to be the power method applied to the companion matrix. The inverse power method, which finds some smallest root first, is what drives the Jenkins–Traub method and gives it its numerical stability and fast convergence even in the presence of multiple or clustered roots.

Laguerre's method, as well as Halley's method, use second order derivatives and complex arithmetics, including the complex square root, to exhibit cubic convergence for simple roots, dominating the quadratic convergence displayed by Newton's method.

Bairstow's method uses Newton's method to find quadratic factors of a polynomial with real coefficients. It can determine both real and complex roots of a real polynomial using only real arithmetic.

The simple Durand–Kerner and the slightly more complicated Aberth method simultaneously finds all the roots using only simple complex number arithmetic.

The splitting circle method is useful for finding the roots of polynomials of high degree to arbitrary precision; it has almost optimal complexity in this setting. Another method with this style is the Dandelin–Gräffe method (actually due to Lobachevsky) which factors the polynomial.

Wilkinson's polynomial illustrates that high precision may be necessary when computing the roots of a polynomial given its coefficients: the problem of finding the roots from the coefficients is in general ill-conditioned.

Finding multiple roots of polynomials

Most root-finding algorithms behave badly when there are multiple roots or very close roots. However, for polynomials whose coefficients are exactly given as integers or rational numbers, there is an efficient method to factorize them into factors that have only simple roots and whose coefficients are also exactly given. This method, called square-free factorization, is based on the fact that the multiple roots of a polynomial are the roots of the greatest common divisor of the polynomial and its derivative.

The square-free factorization of a polynomial p is a factorization $p=p_1p_2^2\cdots p_k^k$ where each $p_i$ is either 1 or a polynomial without multiple root, and two different $p_i$ do not have any common root.

An efficient method to compute this factorization is Yun's algorithm.