Kantorovich theorem

About the convergence of Newton's method

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.^[1]^[2] It is similar to the form of the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.^[3]

Newton's method constructs a sequence of points that under certain conditions will converge to a solution $x$ of an equation $f(x)=0$ or a vector solution of a system of equation $F(x)=0$ . The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.^[1]^[2]

Assumptions

Let $X\subset \mathbb {R} ^{n}$ be an open subset and $F:X\subset \mathbb {R} ^{n}\to \mathbb {R} ^{n}$ a differentiable function with a Jacobian $F^{\prime }(\mathbf {x} )$ that is locally Lipschitz continuous (for instance if $F$ is twice differentiable). That is, it is assumed that for any $x\in X$ there is an open subset $U\subset X$ such that $x\in U$ and there exists a constant $L>0$ such that for any $\mathbf {x} ,\mathbf {y} \in U$

\|F'(\mathbf {x} )-F'(\mathbf {y} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|

holds. The norm on the left is the operator norm. In other words, for any vector $\mathbf {v} \in \mathbb {R} ^{n}$ the inequality

\|F'(\mathbf {x} )(\mathbf {v} )-F'(\mathbf {y} )(\mathbf {v} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|\,\|\mathbf {v} \|

must hold.

Now choose any initial point $\mathbf {x} _{0}\in X$ . Assume that $F'(\mathbf {x} _{0})$ is invertible and construct the Newton step $\mathbf {h} _{0}=-F'(\mathbf {x} _{0})^{-1}F(\mathbf {x} _{0}).$

The next assumption is that not only the next point $\mathbf {x} _{1}=\mathbf {x} _{0}+\mathbf {h} _{0}$ but the entire ball $B(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)$ is contained inside the set $X$ . Let $M$ be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences $(\mathbf {x} _{k})_{k}$ , $(\mathbf {h} _{k})_{k}$ , $(\alpha _{k})_{k}$ according to

{\begin{alignedat}{2}\mathbf {h} _{k}&=-F'(\mathbf {x} _{k})^{-1}F(\mathbf {x} _{k})\\[0.4em]\alpha _{k}&=M\,\|F'(\mathbf {x} _{k})^{-1}\|\,\|\mathbf {h} _{k}\|\\[0.4em]\mathbf {x} _{k+1}&=\mathbf {x} _{k}+\mathbf {h} _{k}.\end{alignedat}}

Statement

Now if $\alpha _{0}\leq {\tfrac {1}{2}}$ then

a solution $\mathbf {x} ^{*}$ of $F(\mathbf {x} ^{*})=0$ exists inside the closed ball ${\bar {B}}(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)$ and
the Newton iteration starting in $\mathbf {x} _{0}$ converges to $\mathbf {x} ^{*}$ with at least linear order of convergence.

A statement that is more precise but slightly more difficult to prove uses the roots $t^{\ast }\leq t^{**}$ of the quadratic polynomial

p(t)=\left({\tfrac {1}{2}}L\|F'(\mathbf {x} _{0})^{-1}\|^{-1}\right)t^{2}-t+\|\mathbf {h} _{0}\|

t^{\ast /**}={\frac {2\|\mathbf {h} _{0}\|}{1\pm {\sqrt {1-2\alpha _{0}}}}}

and their ratio

\theta ={\frac {t^{*}}{t^{**}}}={\frac {1-{\sqrt {1-2\alpha _{0}}}}{1+{\sqrt {1-2\alpha _{0}}}}}.

Then

a solution $\mathbf {x} ^{*}$ exists inside the closed ball ${\bar {B}}(\mathbf {x} _{1},\theta \|\mathbf {h} _{0}\|)\subset {\bar {B}}(\mathbf {x} _{0},t^{*})$
it is unique inside the bigger ball $B(\mathbf {x} _{0},t^{*\ast })$
and the convergence to the solution of $F$ is dominated by the convergence of the Newton iteration of the quadratic polynomial $p(t)$ towards its smallest root $t^{\ast }$ ,^[4] if $t_{0}=0,\,t_{k+1}=t_{k}-{\tfrac {p(t_{k})}{p'(t_{k})}}$ , then
$\|\mathbf {x} _{k+p}-\mathbf {x} _{k}\|\leq t_{k+p}-t_{k}.$
The quadratic convergence is obtained from the error estimate^[5]
$\|\mathbf {x} _{n+1}-\mathbf {x} ^{*}\|\leq \theta ^{2^{n}}\|\mathbf {x} _{n+1}-\mathbf {x} _{n}\|\leq {\frac {\theta ^{2^{n}}}{2^{n}}}\|\mathbf {h} _{0}\|.$

Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973),^[6]^[7] Gragg-Tapia (1974), Potra-Ptak (1980),^[8] Miel (1981),^[9] Potra (1984),^[10] can be derived from the Kantorovich theorem.^[11]

Generalizations

There is a q-analog for the Kantorovich theorem.^[12]^[13] For other generalizations/variations, see Ortega & Rheinboldt (1970).^[14]

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.^[15]

References

^ ^a ^b Deuflhard, P. (2004). Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics. Vol. 35. Berlin: Springer. ISBN 3-540-21099-7.
^ ^a ^b Zeidler, E. (1985). Nonlinear Functional Analysis and its Applications: Part 1: Fixed-Point Theorems. New York: Springer. ISBN 0-387-96499-1.
^ Dennis, John E.; Schnabel, Robert B. (1983). "The Kantorovich and Contractive Mapping Theorems". Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs: Prentice-Hall. pp. 92–94. ISBN 0-13-627216-9.
^ Ortega, J. M. (1968). "The Newton-Kantorovich Theorem". Amer. Math. Monthly. 75 (6): 658–660. doi:10.2307/2313800. JSTOR 2313800.
^ Gragg, W. B.; Tapia, R. A. (1974). "Optimal Error Bounds for the Newton-Kantorovich Theorem". SIAM Journal on Numerical Analysis. 11 (1): 10–13. Bibcode:1974SJNA...11...10G. doi:10.1137/0711002. JSTOR 2156425.
^ Ostrowski, A. M. (1971). "La method de Newton dans les espaces de Banach". C. R. Acad. Sci. Paris. 27 (A): 1251–1253.
^ Ostrowski, A. M. (1973). Solution of Equations in Euclidean and Banach Spaces. New York: Academic Press. ISBN 0-12-530260-6.
^ Potra, F. A.; Ptak, V. (1980). "Sharp error bounds for Newton's process". Numer. Math. 34: 63–72. doi:10.1007/BF01463998.
^ Miel, G. J. (1981). "An updated version of the Kantorovich theorem for Newton's method". Computing. 27 (3): 237–244. doi:10.1007/BF02237981.
^ Potra, F. A. (1984). "On the a posteriori error estimates for Newton's method". Beiträge zur Numerische Mathematik. 12: 125–138.
^ Yamamoto, T. (1986). "A method for finding sharp error bounds for Newton's method under the Kantorovich assumptions". Numerische Mathematik. 49 (2–3): 203–220. doi:10.1007/BF01389624.
^ Rajkovic, P. M.; Stankovic, M. S.; Marinkovic, S. D. (2003). "On q-iterative methods for solving equations and systems". Novi Sad J. Math. 33 (2): 127–137.
^ Rajković, P. M.; Marinković, S. D.; Stanković, M. S. (2005). "On q-Newton–Kantorovich method for solving systems of equations". Applied Mathematics and Computation. 168 (2): 1432–1448. doi:10.1016/j.amc.2004.10.035.
^ Ortega, J. M.; Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. SIAM. OCLC 95021.
^ Oishi, S.; Tanabe, K. (2009). "Numerical Inclusion of Optimum Point for Linear Programming". JSIAM Letters. 1: 5–8. doi:10.14495/jsiaml.1.5.