# Sturm's theorem

In mathematics, the Sturm sequence of a univariate polynomial p is a sequence of polynomials associated with p and its derivative by a variant of Euclid's algorithm for polynomials. Sturm's theorem expresses the number of distinct real roots of p located in an interval in terms of the number of changes of signs of the values of the Sturm sequence at the bounds of the interval. Applied to the interval of all the real numbers, it gives the total number of real roots of p.

Whereas the fundamental theorem of algebra readily yields the overall number of complex roots, counted with multiplicity, it does not provide a procedure for calculating them. Sturm's theorem counts the number of distinct real roots and locates them in intervals. By subdividing the intervals containing some roots, it can isolate the roots into arbitrary small intervals, each containing exactly one root. This yields an arbitrary-precision numeric root finding algorithm for univariate polynomials.

The Sturm sequence and Sturm's theorem are named after Jacques Charles François Sturm.

## The theorem

The Sturm chain or Sturm sequence of a univariate polynomial P(x) with real coefficients is the sequence of polynomials ${\displaystyle P_{0},P_{1},\ldots ,}$ such that

{\displaystyle {\begin{aligned}P_{0}&=P,\\P_{1}&=P',\\P_{i+1}&=-\operatorname {rem} (P_{i-1},P_{i}),\end{aligned}}}

for i ≥ 1, where P' is the derivative of P, and ${\displaystyle \operatorname {rem} (P_{i-1},P_{i})}$ is the remainder of the Euclidean division of ${\displaystyle P_{i-1}}$ by ${\displaystyle P_{i}.}$ The length of the Sturm sequence is at most the degree of P.

The number of sign variations at ξ of the Sturm sequence of P is the number of sign changes–ignoring zeros—in the sequence of real numbers

${\displaystyle P_{0}(\xi ),P_{1}(\xi ),P_{2}(\xi ),\ldots .}$

This number of sign variations is denoted here V(ξ).

Sturm's theorem states that, if P is a square-free polynomial, the number of distinct real roots of P in the half-open interval (a, b] is V(a) − V(b) (here, a and b are real numbers such that a, b).

The theorem extends to unbounded intervals by defining the sign at +∞ of a polynomial as the sign of its leading coefficient (that is, the coefficient of the term of highest degree). At –∞ the sign of a polynomial is the sign of its leading coefficient for a polynomial of even degree, and the opposite sign for a polynomial of odd degree.

In the case of a non-square-free polynomial, if neither a nor b is a multiple root of p, then V(a) − V(b) is the number of distinct real roots of P.

The proof of the theorem is as follows: when the value of x increases from a to b, it may pass through a zero of some ${\displaystyle P_{i}}$ (i > 0); when this occurs, the number of sign variations of ${\displaystyle (P_{i-1},P_{i},P_{i+1})}$ does not changes. When x passes through a root of ${\displaystyle P_{0}=P,}$ the number of sign variations of ${\displaystyle (P_{0},P_{1})}$ decreases from 1 to 0. These are the only values of x where some sign may change.

## Example

Suppose we wish to find the number of roots in some range for the polynomial ${\displaystyle p(x)=x^{4}+x^{3}-x-1}$. So

{\displaystyle {\begin{aligned}p_{0}(x)&=p(x)=x^{4}+x^{3}-x-1\\p_{1}(x)&=p'(x)=4x^{3}+3x^{2}-1\end{aligned}}}

The remainder of the Euclidean division of p0 by p1 is ${\displaystyle -{\tfrac {3}{16}}x^{2}-{\tfrac {3}{4}}x-{\tfrac {15}{16}};}$ multiplying it by −1 we obtain

${\displaystyle p_{2}(x)={\tfrac {3}{16}}x^{2}+{\tfrac {3}{4}}x+{\tfrac {15}{16}}}$.

Next dividing p1 by p2 and multiplying the remainder by −1, we obtain

${\displaystyle p_{3}(x)=-32x-64}$.

Now dividing p2 by p3 and multiplying the remainder by −1, we obtain

${\displaystyle p_{4}(x)=-{\tfrac {3}{16}}}$.

As this is a constant, this finishes the computation of the Sturm sequence.

To find the number of real roots between of ${\displaystyle p_{0}}$ one has to evaluate the sequences of the signs of these polynomials at −∞ and , which are respectively (+, −, +, +, −) and (+, +, +, −, −). Thus

${\displaystyle V(-\infty )-V(+\infty )=3-1=2,}$

which shows that p has two real roots.

This can be verified by noting that p(x) can be factored as (x2 − 1)(x2 + x + 1), where the first factor has the roots −1 and 1, and second factor has no real roots. This last assertion results from the quadratic formula, and also from Sturm's theorem, which gives the sign sequences (+, –, –) at −∞ and (+, +, –) at +∞.

## Generalization

Sturm sequences have been generalized in two directions. For defining each polynomial in the sequence, Sturm used the opposite of the remainder of the Euclidean division of the two preceding ones. The theorem remains true if replace the opposite of the remainder by its product or quotient by a positive constant or the square of a polynomial. It is also useful (see below) to consider sequences where the second polynomial is not the derivative of the first one.

A generalized Sturm sequence is a finite sequence of polynomials with real coefficients

${\displaystyle P_{0},P_{1},\dots ,P_{m}}$

such that

• the degrees are decreasing after the first one: ${\displaystyle \deg P_{i}<\deg P_{i-1}}$ for i = 2, ..., m;
• ${\displaystyle P_{m}}$ does not have any real root or does not changes of sign near its real roots.
• if Pi(ξ) = 0 for 0 < i < m and ξ a real number, then Pi −1 (ξ) Pi + 1(ξ) < 0.

The original Sturm sequence is clearly a generalized Sturm sequence, when the polynomial has no multiple real root. The last condition implies that two consecutive polynomials do not have any common real root.

When computing the original Sturm sequence by Euclidean division, it may occur the one encounters a polynomial that has a factor that is never negative, such a ${\displaystyle x^{2}}$ or ${\displaystyle x^{2}+1}$. In this case, if one continues the computation with the polynomial replaced by its quotient by the nonnegative factor, one gets a generalized Sturm sequence, which may also be used for computing the number of real roots, since the proof of Sturm's theorem still applies (because of the third condition). This may sometimes simplify the computation, although it is generally difficult to find such nonnegative factors, except for even powers of x.

## Use of pseudo-remainder sequences

In computer algebra, the polynomials that are considered have integer coefficients or may be transformed for having integer coefficients. The Sturm sequence of a polynomial with integer coefficients contains generally polynomials whose coefficients are not integers (see above example).

For avoiding to compute with rational numbers, a common method is to replace Euclidean division by pseudo-division for computing polynomial greatest common divisors. This amounts to replace the remainder sequence of the Euclidean algorithm by a pseudo-remainder sequence, a pseudo remainder sequence being a sequence ${\displaystyle p_{0},\ldots ,p_{k}}$ of polynomials such that there are constants ${\displaystyle a_{i}}$ and ${\displaystyle b_{i}}$ such that ${\displaystyle b_{i}p_{i+1}}$ is the remainder of the Euclidean division of ${\displaystyle p_{i-1}}$ by ${\displaystyle p_{i}.}$ For example, the remainder sequence of the Euclidean algorithm is a pseudo-remainder sequence with ${\displaystyle a_{i}=b_{i}=1}$ for every i, and the Sturm sequence of a polynomial is a pseudo-remainder sequence with ${\displaystyle a_{i}=1}$ and ${\displaystyle b_{i}=-1}$ for every i.

Various pseudo-remainder sequences have bee designed for computing greatest common divisors of polynomials with integer coefficients without introducing denominators (see Pseudo-remainder sequence). They can all be made generalized Sturm sequences by choosing the sign of the ${\displaystyle b_{i}}$ for being the opposite of the sign of the${\displaystyle a_{i}.}$ This allows using Sturm's theorem with pseudo-remainder sequences.

## Root isolation

For a polynomial with real coefficients, root isolation consists of finding, for each real root, an interval that contains this root, and no other roots.

This is useful for root finding, as allowing the selection of the roots that are useful to compute and providing a good starting point for fast numerical algorithms such as Newton's method; it is also useful for certifying the result, as if Newton's method converge outside the interval one may immediately deduce that it converges to the wrong root.

Root isolation is also useful for computing with algebraic numbers. In fact, for computing with algebraic numbers, a common method is to represent them as a pair of a polynomial to which the algebraic number is a root, and an isolation interval. For example ${\displaystyle {\sqrt {2}}}$ may be unambiguously represented by ${\displaystyle (x^{2}-2,[0,2]).}$

Sturm's theorem provide a way for isolating real roots that is less efficient (for polynomials with integer coefficients) than other methods involving Descartes' rule of signs. However, it remains useful in some circumstances, mainly for theoretical purpose, for example for algorithms of real algebraic geometry that involve infinitesimals.

## History and related methods

For counting and isolating the real roots, other methods are usually preferred, because they are computationally more efficient; these methods all use Descartes' rule of signs (just like Fourier[1] did back in 1820) and Vincent's theorem. The very first one of those methods was initially called "modified Uspensky's algorithm" by its inventors,[2] but it was later shown that there is no Uspensky's method;[3] afterwards, people started calling it either "Collins–Akritas method"[4] or "Descartes' method" only to be shown that there is no Descartes' method[4] either. Finally, François Boulier, of the University of Lille,[5] p. 24, gave it the name "Vincent–Collins–Akritas" (VCA for short) to also give credit to Vincent. VCA is a bisection method; there exists also a continued fractions method based on Vincent's theorem namely, the Vincent–Akritas–Strzeboński (VAS) method.[6]

VAS is based on Budan's theorem whereas Sturm's method was inspired by Fourier's theorem. In fact Sturm himself,[7] p. 108, acknowledges the great influence Fourier's theorem had on him: « C'est en m'appuyant sur les principes qu'il a posés, et en imitant ses démonstrations, que j'ai trouvé les nouveaux théorèmes que je vais énoncer. » which translates to "It is by relying upon the principles he has laid out and by imitating his proofs that I have found the new theorems which I am about to announce."

## References

1. ^ Fourier, Jean Baptiste Joseph (1820). "Sur l'usage du théorème de Descartes dans la recherche des limites des racines". Bulletin des Sciences, par la Société Philomatique de Paris: 156–165.
2. ^ Collins, George E.; Alkiviadis G. Akritas (1976). Polynomial Real Root Isolation Using Descartes' Rule of Signs. SYMSAC '76, Proceedings of the third ACM symposium on Symbolic and algebraic computation. Yorktown Heights, NY, USA: ACM. pp. 272–275.
3. ^ Akritas, Alkiviadis G. (1986). There is no "Uspensky's Method". In: Proceedings of the fifth ACM Symposium on Symbolic and Algebraic Computation (SYMSAC '86, Waterloo, Ontario, Canada), pp. 88–90.
4. ^ a b Akritas, Alkiviadis G. (2008). There is no "Descartes' method" (PDF). In: M.J.Wester and M. Beaudin (Eds), Computer Algebra in Education, AullonaPress, USA, pp. 19–35.
5. ^ Boulier, François (2010). Systèmes polynomiaux : que signifie " résoudre " ? (PDF). Université Lille 1.
6. ^ Akritas, Alkiviadis G.; A.W. Strzeboński; P. S. Vigklas (2008). "Improving the performance of the continued fractions method using new bounds of positive roots" (PDF). Nonlinear Analysis: Modelling and Control. 13: 265–279.
7. ^ Hourya, Benis-Sinaceur (1988). "Deux moments dans l'histoire du Théorème d'algèbre de Ch. F. Sturm". In: Revue d'histoire des sciences. 41 (2): 99–132.
• Sturm, Jacques Charles François (1829). "Mémoire sur la résolution des équations numériques". Bulletin des Sciences de Férussac. 11: 419–425.
• Sylvester, J. J. (1853). "On a theory of the syzygetic relations of two rational integral functions, comprising an application to the theory of Sturm's functions, and that of the greatest algebraical common measure". Phil. Trans. R. Soc. Lond. 143: 407–548. doi:10.1098/rstl.1853.0018. JSTOR 108572.
• Thomas, Joseph Miller (1941). "Sturm's theorem for multiple roots". National Mathematics Magazine. 15 (8): 391–394. JSTOR 3028551. MR 0005945.
• Heindel, Lee E. (1971), "Integer arithmetic algorithms for polynomial real zero determination", Proc. SYMSAC '71: 415, doi:10.1145/800204.806312, MR 0300434
• Panton, Don B.; Verdini, William A. (1981). "A fortran program for applying Sturm's theorem in counting internal rates of return". J. Financ. Quant. Anal. 16 (3): 381–388. JSTOR 2330245.
• Akritas, Alkiviadis G. (1982). "Reflections on a pair of theorems by Budan and Fourier". Math. Mag. 55 (5): 292–298. doi:10.2307/2690097. JSTOR 2690097. MR 0678195.
• Petersen, Paul (1991). "Multivariate Sturm theory". Lecture Notes in Comp. Science. 539: 318–332. doi:10.1007/3-540-54522-0_120. MR 1229329.
• Yap, Chee (2000). Fundamental Problems in Algorithmic Algebra. Oxford University Press. ISBN 0-19-512516-9.
• Rahman, Q. I.; Schmeisser, G. (2002). Analytic theory of polynomials. London Mathematical Society Monographs. New Series. 26. Oxford: Oxford University Press. ISBN 0-19-853493-0. Zbl 1072.30006.
• Baumol, William. Economic Dynamics, chapter 12, Section 3, "Qualitative information on real roots"
• D.G. Hook and P. R. McAree, "Using Sturm Sequences To Bracket Real Roots of Polynomial Equations" in Graphic Gems I (A. Glassner ed.), Academic Press, pp. 416–422, 1990.