Root-finding algorithm: Difference between revisions

Revision as of 12:48, 16 April 2023

In mathematics and computing, a root-finding algorithm is an algorithm for finding zeros, also called "roots", of continuous functions. A zero of a function $f$ , from the real numbers to real numbers or from the complex numbers to the complex numbers, is a number $x$ such that $f (x) = 0$ . Because the zeros of a function generally cannot be computed exactly or expressed in closed form, root-finding algorithms provide approximations to zeros, expressed either as floating-point numbers or as small isolating intervals, or disks for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).^[1]

Solving an equation $f (x) = g (x)$ is the same as finding the roots of the function $h (x) = f (x) - g (x)$ . Thus root-finding algorithms allow solving any equation defined by continuous functions. However, most root-finding algorithms do not guarantee that they will find all the roots; in particular, if such an algorithm does not find any root, that does not mean that no root exists.

Most numerical root-finding methods use iteration, producing a sequence of numbers that, ideally, converges to the root as a limit. They require one or more initial guesses of the root as starting values; each iteration of the algorithm then produces a successively more accurate approximation to the root. Since the iteration must be stopped at some point, these methods produce an approximation to the root, not an exact solution. Many methods compute subsequent values by evaluating an auxiliary function on the preceding values. The limit is thus a fixed point of the auxiliary function, which is chosen so that the roots of the original equation are its fixed points and so that the iteration converges rapidly to them.

The behaviour of general root-finding algorithms is studied in numerical analysis. However, for polynomials, the study of root-finding generally belongs to computer algebra, since the algebraic properties of polynomials are fundamental to the most efficient algorithms. The efficiency of an algorithm may depend dramatically on the characteristics of the given functions. For example, many algorithms use the derivative of the input function, while others work on every continuous function. In general, numerical algorithms are not guaranteed to find all the roots of a function, so failing to find a root does not prove that there is no root. However, for polynomials, there are specific algorithms that use algebraic properties to certify that no root is missed, and to locate the roots in separate intervals (or disks for complex roots) that are small enough to ensure the convergence of numerical methods (typically Newton's method) to the unique root so located.

Bracketing methods

Bracketing methods determine successively smaller intervals (brackets) that contain a root. When the interval is small enough, a root has been found to the desired accuracy. They generally use the intermediate value theorem, which asserts that if a continuous function has values of opposite signs at the end points of an interval, then the function has at least one root in the interval. Therefore, they must start with an interval on which the function takes opposite signs at the end points. However, in the case of polynomials there are other methods (Descartes' rule of signs, Budan's theorem and Sturm's theorem) for getting information on the number of roots in an interval. They lead to efficient algorithms for real-root isolation of polynomials, which ensure finding all real roots with a guaranteed accuracy.

Bisection method

The simplest root-finding algorithm is the bisection method. Let $f$ be a continuous function for which one knows an interval $[a, b]$ such that $f (a)$ and $f (b)$ have opposite signs (a bracket). Let $c = (a + b)/2$ be the midpoint of the interval (the point that bisects it). Then, unless $f (c) = 0$ , either $f (a)$ and $f (c)$ , or $f (c)$ and $f (b)$ have opposite signs, so one of the two halves is again a bracket and the size of the interval has been halved. Although the bisection method is robust, it gains one and only one bit of accuracy with each iteration. Other methods, under appropriate conditions, can gain accuracy faster.
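
The halving step described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `bisect`, the tolerance `tol`, the iteration cap `max_iter` and the test function $x^2 - 2$ are all assumptions chosen for the example.

```python
def bisect(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a root of f in [a, b] by repeated halving.

    Assumes f is continuous and f(a), f(b) have opposite signs.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2          # midpoint of the current bracket
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:
            return c
        # Keep the half whose end points still have opposite signs.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return (a + b) / 2


# Illustrative use: the positive root of x^2 - 2 lies in [1, 2].
root = bisect(lambda x: x * x - 2, 1.0, 2.0)
```

Because each pass halves the bracket, reaching a width of `tol` from an initial width of 1 takes about $\log_2(1/\text{tol})$ iterations, which is the "one bit per iteration" behaviour noted above.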

False position (regula falsi)

The false position method, also called the regula falsi method, is similar to the bisection method, but instead of the midpoint of the interval it uses the $x$ -intercept of the line that connects the plotted function values at the endpoints of the interval, that is

$$c = \frac{a f(b) - b f(a)}{f(b) - f(a)}.$$

False position is similar to the secant method, except that, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method can be faster than the bisection method and will never diverge like the secant method; however, it may fail to converge in some naive implementations due to roundoff errors that may lead to a wrong sign for $f (c)$ ; typically, this may occur if the rate of variation of $f$ is large in the neighborhood of the root.
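
A basic version of false position, using the $x$-intercept of the chord through the end points as the new point $c$, might look like the sketch below. The function name `false_position`, the stopping rule on $|f(c)|$ and the parameter defaults are assumptions made for illustration; this is the plain method without the improvements that guard against slow one-sided convergence.

```python
def false_position(f, a, b, tol=1e-12, max_iter=1000):
    """Regula falsi on a bracket [a, b] with f(a), f(b) of opposite signs."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        # x-intercept of the line through (a, f(a)) and (b, f(b)).
        c = (a * fb - b * fa) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            return c
        # Keep one point on either side of the root.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return c


# Illustrative use on x^2 - 2 with the bracket [1, 2].
root = false_position(lambda x: x * x - 2, 1.0, 2.0)
```

Unlike bisection, the stopping test here is on the residual $|f(c)|$ and not on the bracket width, because with a convex function such as this one, one end point of the bracket can stay fixed for the whole run.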

ITP method

The ITP method is the only known method to bracket the root with the same worst-case guarantees as the bisection method while guaranteeing superlinear convergence to the root of smooth functions, like the secant method. It is also the only known method guaranteed to outperform the bisection method on average for any continuous distribution on the location of the root (see ITP method § Analysis). It does so by keeping track of both the bracketing interval and the minmax interval, in which any point converges as fast as the bisection method. The construction of the queried point $c$ follows three steps: interpolation (similar to the regula falsi), truncation (adjusting the regula falsi similarly to Regula falsi § Improvements in regula falsi) and then projection onto the minmax interval. The combination of these steps produces a method that is simultaneously minmax optimal and has guarantees similar to those of interpolation-based methods for smooth functions; in practice, it will outperform both the bisection method and interpolation-based methods on both smooth and non-smooth functions.

Interpolation

Many root-finding processes work by interpolation. This consists in using the last computed approximate values of the root for approximating the function by a polynomial of low degree, which takes the same values at these approximate roots. Then the root of the polynomial is computed and used as a new approximate value of the root of the function, and the process is iterated.

Two values allow interpolating a function by a polynomial of degree one (that is approximating the graph of the function by a line). This is the basis of the secant method. Three values define a quadratic function, which approximates the graph of the function by a parabola. This is Muller's method.

Regula falsi is also an interpolation method, which differs from the secant method by using, for interpolating by a line, two points that are not necessarily the last two computed points.

Iterative methods

Although all root-finding algorithms proceed by iteration, an iterative root-finding method generally uses a specific type of iteration, consisting of defining an auxiliary function, which is applied to the last computed approximations of a root for getting a new approximation. The iteration stops when a fixed point (up to the desired precision) of the auxiliary function is reached, that is when the new computed value is sufficiently close to the preceding ones.

Newton's method (and similar derivative-based methods)

Newton's method assumes that the function $f$ has a continuous derivative. It may fail to converge if started too far away from a root. However, when it does converge, it does so faster than the bisection method, and its order of convergence is usually quadratic. Newton's method is also important because it readily generalizes to higher-dimensional problems. Newton-like methods with higher orders of convergence are the Householder's methods. The first one after Newton's method is Halley's method, with cubic order of convergence.
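
For illustration, the Newton iteration $x_{n+1} = x_n - f(x_n)/f'(x_n)$ can be sketched as below. The names `newton`, `df` (the derivative, supplied by the caller), `tol` and `max_iter` are assumptions for this example, as is the choice of test function.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{n+1} = x_n - f(x_n) / f'(x_n).

    df is the derivative of f; convergence is not guaranteed
    if x0 is far from a root.
    """
    x = x0
    for _ in range(max_iter):
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        x_new = x - f(x) / dfx
        # Stop when successive iterates agree to the requested tolerance.
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    raise RuntimeError("Newton's method did not converge")


# Illustrative use: f(x) = x^2 - 2 with derivative 2x, started at 1.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
```

Started at 1, the iterates for this example are 1.5, 1.41666..., 1.41421568..., with the number of correct digits roughly doubling at each step, which is what quadratic convergence means in practice.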

Secant method

Replacing the derivative in Newton's method with a finite difference gives the secant method. This method does not require the computation (or even the existence) of a derivative, but the price is slower convergence: the order is the golden ratio, approximately 1.618. A generalization of the secant method to higher dimensions is Broyden's method.
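
A minimal sketch of the secant iteration follows, in which the derivative in Newton's formula is replaced by the slope through the last two points. The function name `secant` and its parameters are illustrative assumptions.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: Newton's step with f' replaced by a finite difference."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("flat secant line; step undefined")
        # Slope (f1 - f0) / (x1 - x0) stands in for the derivative.
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) <= tol * max(1.0, abs(x2)):
            return x2
        # Retain only the last two points.
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    raise RuntimeError("secant method did not converge")


# Illustrative use on x^2 - 2 with starting values 1 and 2.
root = secant(lambda x: x * x - 2, 1.0, 2.0)
```

Each iteration costs one new function evaluation, compared with one function and one derivative evaluation for Newton's method.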

Steffensen's method

If we use a polynomial fit to remove the quadratic part of the finite difference used in the secant method, so that it better approximates the derivative, we obtain Steffensen's method. It has quadratic convergence, and its behavior (both good and bad) is essentially the same as that of Newton's method, but it does not require a derivative.
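
One common way to write Steffensen's iteration is $x_{n+1} = x_n - f(x_n)^2 / \bigl(f(x_n + f(x_n)) - f(x_n)\bigr)$, that is, a Newton step whose slope is estimated with the step size $h = f(x_n)$. The sketch below assumes that form; the function name `steffensen` and the parameter defaults are illustrative.

```python
def steffensen(f, x0, tol=1e-12, max_iter=50):
    """Steffensen's method: derivative-free, quadratically convergent near a root."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0:
            return x
        # Slope estimate using the step h = f(x), which shrinks near the root.
        g = (f(x + fx) - fx) / fx
        if g == 0:
            raise ZeroDivisionError("slope estimate vanished; step undefined")
        x_new = x - fx / g
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    raise RuntimeError("Steffensen's method did not converge")


# Illustrative use on x^2 - 2, started reasonably close to the root.
root = steffensen(lambda x: x * x - 2, 1.5)
```

As with Newton's method, a starting value far from the root can lead to failure; the example deliberately starts at 1.5.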

Fixed point iteration method

Fixed-point iteration can be used to find a root of a function. Given an equation $f(x)=0$ , we rewrite it in the form $x=g(x)$ (there are often many possible choices of $g(x)$ for a given $f(x)$ ). We then turn this into the iteration $x_{n+1}=g(x_{n})$ , pick a starting value $x_{1}$ , and iterate until the sequence converges. If the iteration converges (and $g$ is continuous), its limit is a fixed point of $g$ and hence a root of $f$ . Convergence is guaranteed for starting values sufficiently close to the root when $|g'(root)|<1$ ; when $|g'(root)|>1$ , the iteration generally moves away from the root.

As an example of converting $f(x)=0$ to $x=g(x)$ , if given the function $f(x)=x^{2}+x-1$ , we will rewrite it as one of the following equations.

$x_{n+1}=(1/x_{n})-1$ , $x_{n+1}=1/(x_{n}+1)$ , $x_{n+1}=1-x_{n}^{2}$ , $x_{n+1}=x_{n}^{2}+2x_{n}-1$ , or $x_{n+1}=\pm {\sqrt {1-x_{n}}}$ .
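
As a sketch of how such a rewriting is used, the code below iterates $x_{n+1}=1/(x_{n}+1)$ for $f(x)=x^{2}+x-1$ . The function name `fixed_point`, the stopping rule and the starting value 1.0 are assumptions made for the example.

```python
def fixed_point(g, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = g(x_n) until successive values agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


# g(x) = 1 / (x + 1) is one rewriting of x^2 + x - 1 = 0.
root = fixed_point(lambda x: 1 / (x + 1), 1.0)

# The limit should be the positive root (sqrt(5) - 1) / 2 of x^2 + x - 1.
residual = root * root + root - 1
```

For this choice, $g'(x) = -1/(x+1)^{2}$ , whose absolute value at the positive root $(\sqrt{5}-1)/2 \approx 0.618$ is about 0.38, so the iteration converges. For the rewriting $x_{n+1}=1-x_{n}^{2}$ , on the other hand, $|g'|$ at the same root is about 1.24, so the condition $|g'(root)|<1$ fails there.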

Inverse interpolation

The appearance of complex values in interpolation methods (as can happen in Muller's method, since the interpolating parabola may have no real root) can be avoided by interpolating the inverse of $f$ , resulting in the inverse quadratic interpolation method. Again, convergence is asymptotically faster than the secant method, but inverse quadratic interpolation often behaves poorly when the iterates are not close to the root.
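
A single step of inverse quadratic interpolation can be sketched as follows: the three points $(f(x_i), x_i)$ are interpolated by a quadratic in $y$ (Lagrange form), which is then evaluated at $y = 0$. The function name `iqi_step` is an assumption for this example, and the sketch requires the three function values to be pairwise distinct.

```python
def iqi_step(f, x0, x1, x2):
    """One inverse quadratic interpolation step from three approximations.

    Interpolates x as a quadratic function of y = f(x) through the three
    points and returns its value at y = 0. Requires f(x0), f(x1), f(x2)
    to be pairwise distinct.
    """
    f0, f1, f2 = f(x0), f(x1), f(x2)
    return (x0 * f1 * f2 / ((f0 - f1) * (f0 - f2))
            + x1 * f0 * f2 / ((f1 - f0) * (f1 - f2))
            + x2 * f0 * f1 / ((f2 - f0) * (f2 - f1)))


# Illustrative use: one step for x^2 - 2 from the points 1, 1.5 and 2.
estimate = iqi_step(lambda x: x * x - 2, 1.0, 1.5, 2.0)

# For a linear function the inverse is linear too, so one step is exact.
exact = iqi_step(lambda x: 2 * x - 3, 0.0, 1.0, 2.0)
```

Because the interpolant is a function of $y$ evaluated at a real point, the result is always real, in contrast to solving a quadratic in $x$. In an actual solver the step would be repeated with the newest three points and protected against nearly equal function values.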

Combinations of methods

Brent's method

Brent's method is a combination of the bisection method, the secant method and inverse quadratic interpolation. At every iteration, Brent's method decides which method out of these three is likely to do best, and proceeds by doing a step according to that method. This gives a robust and fast method, which therefore enjoys considerable popularity.

Ridders' method

Ridders' method is a hybrid method that uses the value of the function at the midpoint of the interval to perform an exponential interpolation to the root. This gives fast convergence, with a guarantee that it needs at most twice as many iterations as the bisection method.
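
The sketch below follows the usual textbook form of Ridders' update, $x = c + (c - a)\,\operatorname{sign}(f(a) - f(b))\,f(c) / \sqrt{f(c)^2 - f(a) f(b)}$ with $c$ the midpoint of the bracket. The function name `ridders`, the stopping rule on the bracket width and the defaults are assumptions made for illustration.

```python
import math


def ridders(f, a, b, tol=1e-12, max_iter=100):
    """Ridders' method on a bracket [a, b] with f(a), f(b) of opposite signs."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x = a
    for _ in range(max_iter):
        c = (a + b) / 2
        fc = f(c)
        s = math.sqrt(fc * fc - fa * fb)  # positive while fa * fb < 0
        if s == 0:
            return c
        # Exponential-interpolation estimate; always lies inside the bracket.
        x = c + (c - a) * math.copysign(1.0, fa - fb) * fc / s
        fx = f(x)
        if fx == 0:
            return x
        # Re-bracket using whichever pair of points has a sign change.
        if fc * fx < 0:
            a, fa, b, fb = c, fc, x, fx
        elif fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        if abs(b - a) < tol:
            return x
    return x


# Illustrative use on x^2 - 2 with the bracket [1, 2].
root = ridders(lambda x: x * x - 2, 1.0, 2.0)
```

Each pass uses two function evaluations (at the midpoint and at the interpolated point) and at least halves the bracket, which is where the "at most twice the work of bisection" bound comes from.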

Roots of polynomials

Finding polynomial roots is a long-standing problem that has been the object of much research throughout history. A testament to this is that, up until the 19th century, algebra meant essentially the theory of polynomial equations.

Finding roots in higher dimensions

^[2]

References

^ Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. (2007). "Chapter 9. Root Finding and Nonlinear Sets of Equations". Numerical Recipes: The Art of Scientific Computing (3rd ed.). New York: Cambridge University Press. ISBN 978-0-521-88068-8.
^ Vrahatis, Michael N. (2020). Sergeyev, Yaroslav D.; Kvasov, Dmitri E. (eds.). "Generalizations of the Intermediate Value Theorem for Approximating Fixed Points and Zeros of Continuous Functions". Numerical Computations: Theory and Algorithms. Cham: Springer International Publishing: 223–238. doi:10.1007/978-3-030-40616-5_17. ISBN 978-3-030-40616-5.

@@ Line 1: / Line 1: @@
 {{Short description|Algorithms for zeros of functions}}
-In [[mathematics]] and [[computing]], a '''root-finding algorithm''' is an [[algorithm]] for finding [[Zero of a function|zeros]], also called "roots", of [[continuous function]]s. A [[zero of a function]] {{math|''f''}}, from the [[real number]]s to real numbers or from the [[complex number]]s to the complex numbers, is a number {{math|''x''}} such that {{math|1=''f''(''x'') = 0}}. As, generally, the zeros of a function cannot be computed exactly nor expressed in [[closed form expression|closed form]], root-finding algorithms provide approximations to zeros, expressed either as [[floating-point arithmetic|floating-point]] numbers or as small isolating [[interval (mathematics)|intervals]],  or [[disk (mathematics)|disks]] for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).
+In [[mathematics]] and [[computing]], a '''root-finding algorithm''' is an [[algorithm]] for finding [[Zero of a function|zeros]], also called "roots", of [[continuous function]]s. A [[zero of a function]] {{math|''f''}}, from the [[real number]]s to real numbers or from the [[complex number]]s to the complex numbers, is a number {{math|''x''}} such that {{math|1=''f''(''x'') = 0}}. As, generally, the zeros of a function cannot be computed exactly nor expressed in [[closed form expression|closed form]], root-finding algorithms provide approximations to zeros, expressed either as [[floating-point arithmetic|floating-point]] numbers or as small isolating [[interval (mathematics)|intervals]],  or [[disk (mathematics)|disks]] for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).<ref>{{refbegin}}
+*{{Cite book |last1=Press |first1=W. H. |title=Numerical Recipes: The Art of Scientific Computing |last2=Teukolsky |first2=S. A. |last3=Vetterling |first3=W. T. |last4=Flannery |first4=B. P. |publisher=Cambridge University Press |year=2007 |isbn=978-0-521-88068-8 |edition=3rd |publication-place=New York |chapter=Chapter 9. Root Finding and Nonlinear Sets of Equations |chapter-url=http://apps.nrbook.com/empanel/index.html#pg=442}}
+{{refend}}</ref>
 [[Equation solving|Solving an equation]] {{math|1=''f''(''x'') = ''g''(''x'')}} is the same as finding the roots of the function {{math|1=''h''(''x'') = ''f''(''x'') – ''g''(''x'')}}. Thus root-finding algorithms allow solving any [[equation (mathematics)|equation]] defined by continuous functions. However, most root-finding algorithms do not guarantee that they will find all the roots; in particular, if such an algorithm does not find any root, that does not mean that no root exists.
@@ Line 66: / Line 68: @@
 == Roots of polynomials {{anchor|Polynomials}} ==
 {{excerpt|Polynomial root-finding algorithms}}
+== Finding roots in higher dimensions ==
+<ref>{{Cite journal |last=Vrahatis |first=Michael N. |date=2020 |editor-last=Sergeyev |editor-first=Yaroslav D. |editor2-last=Kvasov |editor2-first=Dmitri E. |title=Generalizations of the Intermediate Value Theorem for Approximating Fixed Points and Zeros of Continuous Functions |url=https://link.springer.com/chapter/10.1007/978-3-030-40616-5_17 |journal=Numerical Computations: Theory and Algorithms |language=en |location=Cham |publisher=Springer International Publishing |pages=223–238 |doi=10.1007/978-3-030-40616-5_17 |isbn=978-3-030-40616-5}}</ref>
 == See also ==
@@ Line 85: / Line 90: @@
 == References ==
 {{reflist}}
-{{refbegin}}
-*{{Cite book |last1=Press |first1=W. H. |last2=Teukolsky |first2=S. A. |last3=Vetterling |first3=W. T. |last4=Flannery |first4=B. P. |year=2007 |title=Numerical Recipes: The Art of Scientific Computing |edition=3rd |publisher=Cambridge University Press |publication-place=New York |isbn=978-0-521-88068-8 |chapter=Chapter 9. Root Finding and Nonlinear Sets of Equations |chapter-url=http://apps.nrbook.com/empanel/index.html#pg=442}}
-{{refend}}
 == Further reading ==
 * J.M. McNamee: "Numerical Methods for Roots of Polynomials - Part I", Elsevier (2007).
