Root-finding algorithm

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

In mathematics and computing, a root-finding algorithm is an algorithm for finding roots of continuous functions. A root of a function f, from the real numbers to real numbers or from the complex numbers to the complex numbers, is a number x such that f(x) = 0. As, generally, the roots of a function cannot be computed exactly, nor expressed in closed form, root-finding algorithms provide approximations to roots, expressed either as floating point numbers or as small isolating intervals, or disks for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).

Solving an equation f(x) = g(x) is the same as finding the roots of the function h(x) = f(x) – g(x). Thus root-finding algorithms allow solving any equation defined by continuous functions. However, most root-finding algorithms do not guarantee that they will find all the roots; in particular, if such an algorithm does not find any root, that does not mean that no root exists.

Most numerical root-finding methods use iteration, producing a sequence of numbers that hopefully converge towards the root as a limit. They require one or more initial guesses of the root as starting values, then each iteration of the algorithm produces a successively more accurate approximation to the root. Since the iteration must be stopped at some point these methods produce an approximation to the root, not an exact solution. Many methods compute subsequent values by evaluating an auxiliary function on the preceding values. The limit is thus a fixed point of the auxiliary function, which is chosen for having the roots of the original equation as fixed points, and for converging rapidly to these fixed points.

The behaviour of root-finding algorithms is studied in numerical analysis. The efficiency of an algorithm may depend dramatically on the characteristics of the given functions. For example, many algorithms use the derivative of the input function, while others work on every continuous function. In general, numerical algorithms are not guaranteed to find all the roots of a function, so failing to find a root does not prove that there is no root. However, for polynomials, there are specific algorithms that use algebraic properties for locating the roots in intervals (or disks for complex roots) that are small enough to ensure the convergence of numerical methods (typically Newton's method) to the unique root so located.

Bracketing methods[edit]

Bracketing methods determine successively smaller intervals (brackets) that contain a root. When the interval is small enough, then a root has been found. They generally use the intermediate value theorem, which asserts that if a continuous function has values of opposite signs at the end points of an interval, then the function has at least one root in the interval. (In the case of polynomials there are other methods, based on Sturm's theorem or Descartes' rule of signs, that give the exact number of roots in an interval.) These methods provide absolute error bounds on the root that is found.

Bisection method[edit]

The simplest root-finding algorithm is the bisection method. Let f be a continuous function, for which one knows an interval [a, b] such that f(a) and f(b) have opposite signs (a bracket). Let c = (a +b)/2 be the middle of the interval (the midpoint or the point that bisects the interval). Then either f(a) and f(c), or f(c) and f(b) have opposite signs, and one has divided by two the size of the interval. Although the bisection method is robust, it gains one and only one bit of accuracy with each iteration. Other methods, under appropriate conditions, can gain accuracy faster.

False position (regula falsi)[edit]

The false position method, also called the regula falsi method, is similar to the bisection method, but instead of using bisection search's middle of the interval it uses the x-intercept of the line that connects the plotted function values at the endpoints of the interval, that is

False position is similar to the secant method, except that, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method can be faster than the bisection method and will never diverge like the secant method; however, it may fail to converge in some naive implementations due to roundoff errors.[clarification needed]

Ridders' method is a variant of the false position method that uses the value of function at the midpoint of the interval, for getting a function with the same root, to which the false position method is applied. This gives a faster convergence with a similar robustness.


Many root-finding processes work by interpolation. This consists in using the last computed approximate values of the root for approximating the function by a polynomial of low degree, which takes the same values at these approximate roots. Then the root of the polynomial is computed and used as a new approximate value of the root of the function, and the process is iterated.

Two values allow interpolating a function by a polynomial of degree one (that is approximating the graph of the function by a line). This is the basis of the secant method. Three values define a quadratic function, which approximates the graph of the function by a parabola. This is Muller's method.

Regula falsi is also an interpolation method, which differs secant method by using, for interpolating by a line, two points that are not necessarily the last two computed points.

Iterative methods[edit]

Although all root-finding algorithms proceed by iteration, an iterative root-finding method generally use a specific type of iteration, consisting of defining an auxiliary function, which is applied to the last computed approximations of a root for getting a new approximation. The iteration stops when a fixed point (up to the desired precision) of the auxiliary function is reached, that is when the new computed value is sufficiently close to the preceding ones.

Newton's method (and similar derivative-based methods)[edit]

Newton's method assumes the function f to have a continuous derivative. Newton's method may not converge if started too far away from a root. However, when it does converge, it is faster than the bisection method, and is usually quadratic. Newton's method is also important because it readily generalizes to higher-dimensional problems. Newton-like methods with higher orders of convergence are the Householder's methods. The first one after Newton's method is Halley's method with cubic order of convergence.

Secant method[edit]

Replacing the derivative in Newton's method with a finite difference, we get the secant method. This method does not require the computation (nor the existence) of a derivative, but the price is slower convergence (the order is approximately 1.6). A generalization of the secant method in higher dimensions is Broyden's method.

Steffensen's method[edit]

If we use a polynomial fit to remove the quadratic part of the finite difference used in the Secant method, so that it better approximates the derivative, we obtain Steffensen's method, which has quadratic convergence, and whose behavior (both good and bad) is essentially the same Newton's method, but does not require a derivative.

Inverse interpolation[edit]

The appearance of complex values in interpolation methods can be avoided by interpolating the inverse of f, resulting in the inverse quadratic interpolation method. Again, convergence is asymptotically faster than the secant method, but inverse quadratic interpolation often behaves poorly when the iterates are not close to the root.

Combinations of methods[edit]

Brent's method[edit]

Brent's method is a combination of the bisection method, the secant method and inverse quadratic interpolation. At every iteration, Brent's method decides which method out of these three is likely to do best, and proceeds by doing a step according to that method. This gives a robust and fast method, which therefore enjoys considerable popularity.

Finding roots of polynomials[edit]

Much attention has been given to the special case that the function f is a polynomial, and there are several root-finding algorithms for polynomials. For univariate polynomials of degrees one (linear polynomial) through four (quartic polynomial), there are closed-form solutions which produce all roots. Linear polynomials are easy to solve, but using the quadratic formula to solve quadratic (second degree) equations may require some care to ensure numerical stability. The closed-form solutions for degrees three (cubic polynomial) and four (quartic polynomial) are complicated and require much more care; consequently, numerical methods such as Bairstow's method may be easier to use. Fifth-degree (quintic) and higher-degree polynomials do not have a general algebraic solution according to the Abel–Ruffini theorem (1824, 1799).

Finding one root at a time[edit]

The general idea is to find a root of the polynomial and then apply Horner's method to remove the corresponding factor according to the Ruffini rule.

This iterative scheme is numerically unstable; the approximation errors accumulate during the successive factorizations, so that the last roots are determined with a polynomial that deviates widely from a factor of the original polynomial. To reduce this error, it is advisable to find the roots in increasing order of magnitude.

Wilkinson's polynomial illustrates that high precision may be necessary when computing the roots of a polynomial given its coefficients: the problem of finding the roots from the coefficients is in general ill-conditioned.

The most simple method to find a single root fast is Newton's method. One can use Horner's method twice to efficiently evaluate the value of the polynomial function and its first derivative; this combination is called Birge–Vieta's method. This method provides quadratic convergence for simple roots at the cost of two polynomial evaluations per step.

Closely related to Newton's method are Halley's method and Laguerre's method. Using one additional Horner evaluation, the value of the second derivative is used to obtain methods of cubic convergence order for simple roots. If one starts from a point x close to a root and uses the same complexity of 6 function evaluations, these methods perform two steps with a residual of O(|f(x)|9), compared to three steps of Newton's method with a reduction O(|f(x)|8), giving a slight advantage to these methods.

When applying these methods to polynomials with real coefficients and real starting points, Newton's and Halley's method stay inside the real number line. One has to choose complex starting points to find complex roots. In contrast, the Laguerre method with a square root in its evaluation will leave the real axis of its own accord.

Another class of methods is based on translating the problem of finding polynomial roots to the problem of finding eigenvalues of the companion matrix of the polynomial. In principle, one can use any eigenvalue algorithm to find the roots of the polynomial. However, for efficiency reasons one prefers methods that employ the structure of the matrix, that is, can be implemented in matrix-free form. Among these methods are the power method, whose application to the transpose of the companion matrix is the classical Bernoulli's method to find the root of greatest modulus. The inverse power method with shifts, which finds some smallest root first, is what drives the complex (cpoly) variant of the Jenkins–Traub method and gives it its numerical stability. Additionally, it is insensitive to multiple roots and has fast convergence with order even in the presence of clustered roots. This fast convergence comes with a cost of three Horner evaluations per step, resulting in a residual of O(|f(x)|2+3Φ), which is slower than three steps of Newton's method.

Finding roots in pairs[edit]

If the given polynomial only has real coefficients, one may wish to avoid computations with complex numbers. To that effect, one has to find quadratic factors for pairs of conjugate complex roots. The application of the multi-dimensional Newton's method to this task results in Bairstow's method. In the framework of inverse power iterations of the companion matrix, the double shift method of Francis results in the real (rpoly) variant of the Jenkins–Traub method.

Finding all roots at once[edit]

The simple Durand–Kerner and the slightly more complicated Aberth methods simultaneously find all of the roots using only simple complex number arithmetic. Accelerated algorithms for multi-point evaluation and interpolation similar to the fast Fourier transform can help speed them up for large degrees of the polynomial. It is advisable to choose an asymmetric, but evenly distributed set of initial points.

Another method with this style is the Dandelin–Gräffe method (sometimes also ascribed to Lobachevsky), which uses polynomial transformations to repeatedly and implicitly square the roots. This greatly magnifies variances in the roots. Applying Viète's formulas, one obtains easy approximations for the modulus of the roots, and with some more effort, for the roots themselves.

Exclusion and enclosure methods[edit]

Several fast tests exist that tell if a segment of the real line or a region of the complex plane contains no roots. By bounding the modulus of the roots and recursively subdividing the initial region indicated by these bounds, one can isolate small regions that may contain roots and then apply other methods to locate them exactly.

All these methods involve finding the coefficients of shifted and scaled versions of the polynomial. For large degrees, FFT-based accelerated methods become viable.

For real roots, Sturm's theorem and Descartes' rule of signs with its extension in the Budan–Fourier theorem provide guides to locating and separating roots. This plus interval arithmetic combined with Newton's method yields robust and fast algorithms.

The Lehmer–Schur algorithm uses the Schur–Cohn test for circles, Wilf's global bisection algorithm uses a winding number computation for rectangular regions in the complex plane.

The splitting circle method uses FFT-based polynomial transformations to find large-degree factors corresponding to clusters of roots. The precision of the factorization is maximized using a Newton-type iteration. This method is useful for finding the roots of polynomials of high degree to arbitrary precision; it has almost optimal complexity in this setting.

Method based on the Budan–Fourier theorem or Sturm chains[edit]

The methods in this class give for polynomials with rational coefficients, and when carried out in rational arithmetic, provably complete enclosures of all real roots by rational intervals. The test of an interval for real roots using Budan's theorem is computationally simple but may yield false positive results. However, these will be reliably detected in the following transformation and refinement of the interval. The test based on Sturm chains is computationally more involved but gives always the exact number of real roots in the interval.

The algorithm for isolating the roots, using Descartes' rule of signs and Vincent's theorem, had been originally called modified Uspensky's algorithm by its inventors Collins and Akritas.[1] After going through names like "Collins–Akritas method" and "Descartes' method" (too confusing if ones considers Fourier's article[2]), it was finally François Boulier, of Lille University, who gave it the name Vincent–Collins–Akritas (VCA) method,[3] p. 24, based on "Uspensky's method" not existing[4] and neither does "Descartes' method".[5] This algorithm has been improved by Rouillier and Zimmerman,[6] and the resulting implementation is, to date,[when?] the fastest bisection method. It has the same worst case complexity as the Sturm algorithm, but is almost always much faster. It is the default algorithm of Maple root-finding function fsolve. Another method based on Vincent's theorem is the Vincent–Akritas–Strzeboński (VAS) method;[7] it has been shown[8] that the VAS (continued fractions) method is faster than the fastest implementation of the VCA (bisection) method,[6] which was independently confirmed elsewhere;[9] more precisely, for the Mignotte polynomials of high degree, VAS is about 50,000 times faster than the fastest implementation of VCA. VAS is the default algorithm for root isolation in Mathematica, SageMath, SymPy, Xcas. See Budan's theorem for a description of the historical background of these methods. For a comparison between Sturm's method and VAS, use the functions realroot(poly) and time(realroot(poly)) of Xcas. By default, to isolate the real roots of poly realroot uses the VAS method; to use Sturm's method, write realroot(sturm, poly). See also the External links for a pointer to an iPhone/iPod/iPad application that does the same thing.

Finding multiple roots of polynomials[edit]

Most root-finding algorithms behave badly when there are multiple roots or very close roots. However, for polynomials whose coefficients are exactly given as integers or rational numbers, there is an efficient method to factorize them into factors that have only simple roots and whose coefficients are also exactly given. This method, called square-free factorization, is based on the multiple roots of a polynomial being the roots of the greatest common divisor of the polynomial and its derivative.

The square-free factorization of a polynomial p is a factorization where each is either 1 or a polynomial without multiple roots, and two different do not have any common root.

An efficient method to compute this factorization is Yun's algorithm.

See also[edit]



  1. ^ Collins, George E.; Akritas, Alkiviadis G. (1976). Polynomial Real Root Isolation Using Descartes' Rule of Signs. SYMSAC '76, Proceedings of the third ACM symposium on Symbolic and algebraic computation. Yorktown Heights, NY, USA: ACM. pp. 272–275. 
  2. ^ Fourier, Jean-Baptiste Joseph (1820). "Sur l'usage du théorème de Descartes dans la recherche des limites des raciness" [On the use of Descartes' theorem in the search for the bounds of roots]. Bulletin des Sciences, par la Société Philomatique de Paris (in French): 156–165. 
  3. ^ Boulier, François (2010). Systèmes polynomiaux : que signifie " résoudre " ? [Polynomial systems: what does "solve" mean?] (PDF) (in French). Université Lille 1. 
  4. ^ Akritas, Alkiviadis G. (1986). There's no "Uspensky's Method". Proceedings of the fifth ACM Symposium on Symbolic and Algebraic Computation (SYMSAC '86). Waterloo, Ontario, Canada. pp. 88–90. 
  5. ^ Akritas, Alkiviadis G. (2008). "There is no "Descartes' method"". In Wester, M. J.; Beaudin, M. Computer Algebra in Education (PDF). USA: Aullona Press. pp. 19–35. 
  6. ^ a b Rouillier, F.; Zimmerman, P. (2004). "Efficient isolation of polynomial's real roots". Journal of Computational and Applied Mathematics. 162. doi:10.1016/ 
  7. ^ Akritas, Alkiviadis G.; Strzeboński, A. W.; Vigklas, P. S. (2008). "Improving the performance of the continued fractions method using new bounds of positive roots" (PDF). Nonlinear Analysis: Modelling and Control. 13: 265–279. 
  8. ^ Akritas, Alkiviadis G.; Strzeboński, Adam W. (2005). "A Comparative Study of Two Real Root Isolation Methods" (PDF). Nonlinear Analysis: Modelling and Control. 10 (4): 297–304. 
  9. ^ Tsigaridas, P. E.; Emiris, I. Z. (2006). "Univariate polynomial real root isolation: Continued fractions revisited". LNCS. 4168: 817–828. arXiv:cs/0604066Freely accessible. doi:10.1007/11841036_72. 


External links[edit]