# Kantorovich theorem

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.[1][2] It is similar to the form of the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.[3]

Newton's method constructs a sequence of points that under certain conditions will converge to a solution ${\displaystyle x}$ of an equation ${\displaystyle f(x)=0}$ or a vector solution of a system of equation ${\displaystyle F(x)=0}$. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.[1][2]

## Assumptions

Let ${\displaystyle X\subset \mathbb {R} ^{n}}$ be an open subset and ${\displaystyle F:X\subset \mathbb {R} ^{n}\to \mathbb {R} ^{n}}$ a differentiable function with a Jacobian ${\displaystyle F^{\prime }(\mathbf {x} )}$ that is locally Lipschitz continuous (for instance if ${\displaystyle F}$ is twice differentiable). That is, it is assumed that for any ${\displaystyle x\in X}$ there is an open subset ${\displaystyle U\subset X}$ such that ${\displaystyle x\in U}$ and there exists a constant ${\displaystyle L>0}$ such that for any ${\displaystyle \mathbf {x} ,\mathbf {y} \in U}$

${\displaystyle \|F'(\mathbf {x} )-F'(\mathbf {y} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|}$

holds. The norm on the left is the operator norm. In other words, for any vector ${\displaystyle \mathbf {v} \in \mathbb {R} ^{n}}$ the inequality

${\displaystyle \|F'(\mathbf {x} )(\mathbf {v} )-F'(\mathbf {y} )(\mathbf {v} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|\,\|\mathbf {v} \|}$

must hold.

Now choose any initial point ${\displaystyle \mathbf {x} _{0}\in X}$. Assume that ${\displaystyle F'(\mathbf {x} _{0})}$ is invertible and construct the Newton step ${\displaystyle \mathbf {h} _{0}=-F'(\mathbf {x} _{0})^{-1}F(\mathbf {x} _{0}).}$

The next assumption is that not only the next point ${\displaystyle \mathbf {x} _{1}=\mathbf {x} _{0}+\mathbf {h} _{0}}$ but the entire ball ${\displaystyle B(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)}$ is contained inside the set ${\displaystyle X}$. Let ${\displaystyle M}$ be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences ${\displaystyle (\mathbf {x} _{k})_{k}}$, ${\displaystyle (\mathbf {h} _{k})_{k}}$, ${\displaystyle (\alpha _{k})_{k}}$ according to

{\displaystyle {\begin{alignedat}{2}\mathbf {h} _{k}&=-F'(\mathbf {x} _{k})^{-1}F(\mathbf {x} _{k})\\[0.4em]\alpha _{k}&=M\,\|F'(\mathbf {x} _{k})^{-1}\|\,\|\mathbf {h} _{k}\|\\[0.4em]\mathbf {x} _{k+1}&=\mathbf {x} _{k}+\mathbf {h} _{k}.\end{alignedat}}}

## Statement

Now if ${\displaystyle \alpha _{0}\leq {\tfrac {1}{2}}}$ then

1. a solution ${\displaystyle \mathbf {x} ^{*}}$ of ${\displaystyle F(\mathbf {x} ^{*})=0}$ exists inside the closed ball ${\displaystyle {\bar {B}}(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)}$ and
2. the Newton iteration starting in ${\displaystyle \mathbf {x} _{0}}$ converges to ${\displaystyle \mathbf {x} ^{*}}$ with at least linear order of convergence.

A statement that is more precise but slightly more difficult to prove uses the roots ${\displaystyle t^{\ast }\leq t^{**}}$ of the quadratic polynomial

${\displaystyle p(t)=\left({\tfrac {1}{2}}L\|F'(\mathbf {x} _{0})^{-1}\|^{-1}\right)t^{2}-t+\|\mathbf {h} _{0}\|}$,
${\displaystyle t^{\ast /**}={\frac {2\|\mathbf {h} _{0}\|}{1\pm {\sqrt {1-2\alpha _{0}}}}}}$

and their ratio

${\displaystyle \theta ={\frac {t^{*}}{t^{**}}}={\frac {1-{\sqrt {1-2\alpha _{0}}}}{1+{\sqrt {1-2\alpha _{0}}}}}.}$

Then

1. a solution ${\displaystyle \mathbf {x} ^{*}}$ exists inside the closed ball ${\displaystyle {\bar {B}}(\mathbf {x} _{1},\theta \|\mathbf {h} _{0}\|)\subset {\bar {B}}(\mathbf {x} _{0},t^{*})}$
2. it is unique inside the bigger ball ${\displaystyle B(\mathbf {x} _{0},t^{*\ast })}$
3. and the convergence to the solution of ${\displaystyle F}$ is dominated by the convergence of the Newton iteration of the quadratic polynomial ${\displaystyle p(t)}$ towards its smallest root ${\displaystyle t^{\ast }}$,[4] if ${\displaystyle t_{0}=0,\,t_{k+1}=t_{k}-{\tfrac {p(t_{k})}{p'(t_{k})}}}$, then
${\displaystyle \|\mathbf {x} _{k+p}-\mathbf {x} _{k}\|\leq t_{k+p}-t_{k}.}$
4. The quadratic convergence is obtained from the error estimate[5]
${\displaystyle \|\mathbf {x} _{n+1}-\mathbf {x} ^{*}\|\leq \theta ^{2^{n}}\|\mathbf {x} _{n+1}-\mathbf {x} _{n}\|\leq {\frac {\theta ^{2^{n}}}{2^{n}}}\|\mathbf {h} _{0}\|.}$

## Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973),[6][7] Gragg-Tapia (1974), Potra-Ptak (1980),[8] Miel (1981),[9] Potra (1984),[10] can be derived from the Kantorovich theorem.[11]

## Generalizations

There is a q-analog for the Kantorovich theorem.[12][13] For other generalizations/variations, see Ortega & Rheinboldt (1970).[14]

## Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.[15]

## References

1. ^ a b Deuflhard, P. (2004). Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics. Vol. 35. Berlin: Springer. ISBN 3-540-21099-7.
2. ^ a b Zeidler, E. (1985). Nonlinear Functional Analysis and its Applications: Part 1: Fixed-Point Theorems. New York: Springer. ISBN 0-387-96499-1.
3. ^ Dennis, John E.; Schnabel, Robert B. (1983). "The Kantorovich and Contractive Mapping Theorems". Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs: Prentice-Hall. pp. 92–94. ISBN 0-13-627216-9.
4. ^ Ortega, J. M. (1968). "The Newton-Kantorovich Theorem". Amer. Math. Monthly. 75 (6): 658–660. doi:10.2307/2313800. JSTOR 2313800.
5. ^ Gragg, W. B.; Tapia, R. A. (1974). "Optimal Error Bounds for the Newton-Kantorovich Theorem". SIAM Journal on Numerical Analysis. 11 (1): 10–13. Bibcode:1974SJNA...11...10G. doi:10.1137/0711002. JSTOR 2156425.
6. ^ Ostrowski, A. M. (1971). "La method de Newton dans les espaces de Banach". C. R. Acad. Sci. Paris. 27 (A): 1251–1253.
7. ^ Ostrowski, A. M. (1973). Solution of Equations in Euclidean and Banach Spaces. New York: Academic Press. ISBN 0-12-530260-6.
8. ^ Potra, F. A.; Ptak, V. (1980). "Sharp error bounds for Newton's process". Numer. Math. 34: 63–72. doi:10.1007/BF01463998.
9. ^ Miel, G. J. (1981). "An updated version of the Kantorovich theorem for Newton's method". Computing. 27 (3): 237–244. doi:10.1007/BF02237981.
10. ^ Potra, F. A. (1984). "On the a posteriori error estimates for Newton's method". Beiträge zur Numerische Mathematik. 12: 125–138.
11. ^ Yamamoto, T. (1986). "A method for finding sharp error bounds for Newton's method under the Kantorovich assumptions". Numerische Mathematik. 49 (2–3): 203–220. doi:10.1007/BF01389624.
12. ^ Rajkovic, P. M.; Stankovic, M. S.; Marinkovic, S. D. (2003). "On q-iterative methods for solving equations and systems". Novi Sad J. Math. 33 (2): 127–137.
13. ^ Rajković, P. M.; Marinković, S. D.; Stanković, M. S. (2005). "On q-Newton–Kantorovich method for solving systems of equations". Applied Mathematics and Computation. 168 (2): 1432–1448. doi:10.1016/j.amc.2004.10.035.
14. ^ Ortega, J. M.; Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. SIAM. OCLC 95021.
15. ^ Oishi, S.; Tanabe, K. (2009). "Numerical Inclusion of Optimum Point for Linear Programming". JSIAM Letters. 1: 5–8. doi:10.14495/jsiaml.1.5.