Kantorovich theorem

About the convergence of Newton's method

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.[1][2] It is similar to the form of the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.[3]

Newton's method constructs a sequence of points that under certain conditions will converge to a solution x {\displaystyle x} of an equation f ( x ) = 0 {\displaystyle f(x)=0} or a vector solution of a system of equation F ( x ) = 0 {\displaystyle F(x)=0} . The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.[1][2]

Assumptions

Let X R n {\displaystyle X\subset \mathbb {R} ^{n}} be an open subset and F : X R n R n {\displaystyle F:X\subset \mathbb {R} ^{n}\to \mathbb {R} ^{n}} a differentiable function with a Jacobian F ( x ) {\displaystyle F^{\prime }(\mathbf {x} )} that is locally Lipschitz continuous (for instance if F {\displaystyle F} is twice differentiable). That is, it is assumed that for any x X {\displaystyle x\in X} there is an open subset U X {\displaystyle U\subset X} such that x U {\displaystyle x\in U} and there exists a constant L > 0 {\displaystyle L>0} such that for any x , y U {\displaystyle \mathbf {x} ,\mathbf {y} \in U}

F ( x ) F ( y ) L x y {\displaystyle \|F'(\mathbf {x} )-F'(\mathbf {y} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|}

holds. The norm on the left is the operator norm. In other words, for any vector v R n {\displaystyle \mathbf {v} \in \mathbb {R} ^{n}} the inequality

F ( x ) ( v ) F ( y ) ( v ) L x y v {\displaystyle \|F'(\mathbf {x} )(\mathbf {v} )-F'(\mathbf {y} )(\mathbf {v} )\|\leq L\;\|\mathbf {x} -\mathbf {y} \|\,\|\mathbf {v} \|}

must hold.

Now choose any initial point x 0 X {\displaystyle \mathbf {x} _{0}\in X} . Assume that F ( x 0 ) {\displaystyle F'(\mathbf {x} _{0})} is invertible and construct the Newton step h 0 = F ( x 0 ) 1 F ( x 0 ) . {\displaystyle \mathbf {h} _{0}=-F'(\mathbf {x} _{0})^{-1}F(\mathbf {x} _{0}).}

The next assumption is that not only the next point x 1 = x 0 + h 0 {\displaystyle \mathbf {x} _{1}=\mathbf {x} _{0}+\mathbf {h} _{0}} but the entire ball B ( x 1 , h 0 ) {\displaystyle B(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)} is contained inside the set X {\displaystyle X} . Let M {\displaystyle M} be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences ( x k ) k {\displaystyle (\mathbf {x} _{k})_{k}} , ( h k ) k {\displaystyle (\mathbf {h} _{k})_{k}} , ( α k ) k {\displaystyle (\alpha _{k})_{k}} according to

h k = F ( x k ) 1 F ( x k ) α k = M F ( x k ) 1 h k x k + 1 = x k + h k . {\displaystyle {\begin{alignedat}{2}\mathbf {h} _{k}&=-F'(\mathbf {x} _{k})^{-1}F(\mathbf {x} _{k})\\[0.4em]\alpha _{k}&=M\,\|F'(\mathbf {x} _{k})^{-1}\|\,\|\mathbf {h} _{k}\|\\[0.4em]\mathbf {x} _{k+1}&=\mathbf {x} _{k}+\mathbf {h} _{k}.\end{alignedat}}}

Statement

Now if α 0 1 2 {\displaystyle \alpha _{0}\leq {\tfrac {1}{2}}} then

  1. a solution x {\displaystyle \mathbf {x} ^{*}} of F ( x ) = 0 {\displaystyle F(\mathbf {x} ^{*})=0} exists inside the closed ball B ¯ ( x 1 , h 0 ) {\displaystyle {\bar {B}}(\mathbf {x} _{1},\|\mathbf {h} _{0}\|)} and
  2. the Newton iteration starting in x 0 {\displaystyle \mathbf {x} _{0}} converges to x {\displaystyle \mathbf {x} ^{*}} with at least linear order of convergence.

A statement that is more precise but slightly more difficult to prove uses the roots t t {\displaystyle t^{\ast }\leq t^{**}} of the quadratic polynomial

p ( t ) = ( 1 2 L F ( x 0 ) 1 1 ) t 2 t + h 0 {\displaystyle p(t)=\left({\tfrac {1}{2}}L\|F'(\mathbf {x} _{0})^{-1}\|^{-1}\right)t^{2}-t+\|\mathbf {h} _{0}\|} ,
t / = 2 h 0 1 ± 1 2 α 0 {\displaystyle t^{\ast /**}={\frac {2\|\mathbf {h} _{0}\|}{1\pm {\sqrt {1-2\alpha _{0}}}}}}

and their ratio

θ = t t = 1 1 2 α 0 1 + 1 2 α 0 . {\displaystyle \theta ={\frac {t^{*}}{t^{**}}}={\frac {1-{\sqrt {1-2\alpha _{0}}}}{1+{\sqrt {1-2\alpha _{0}}}}}.}

Then

  1. a solution x {\displaystyle \mathbf {x} ^{*}} exists inside the closed ball B ¯ ( x 1 , θ h 0 ) B ¯ ( x 0 , t ) {\displaystyle {\bar {B}}(\mathbf {x} _{1},\theta \|\mathbf {h} _{0}\|)\subset {\bar {B}}(\mathbf {x} _{0},t^{*})}
  2. it is unique inside the bigger ball B ( x 0 , t ) {\displaystyle B(\mathbf {x} _{0},t^{*\ast })}
  3. and the convergence to the solution of F {\displaystyle F} is dominated by the convergence of the Newton iteration of the quadratic polynomial p ( t ) {\displaystyle p(t)} towards its smallest root t {\displaystyle t^{\ast }} ,[4] if t 0 = 0 , t k + 1 = t k p ( t k ) p ( t k ) {\displaystyle t_{0}=0,\,t_{k+1}=t_{k}-{\tfrac {p(t_{k})}{p'(t_{k})}}} , then
    x k + p x k t k + p t k . {\displaystyle \|\mathbf {x} _{k+p}-\mathbf {x} _{k}\|\leq t_{k+p}-t_{k}.}
  4. The quadratic convergence is obtained from the error estimate[5]
    x n + 1 x θ 2 n x n + 1 x n θ 2 n 2 n h 0 . {\displaystyle \|\mathbf {x} _{n+1}-\mathbf {x} ^{*}\|\leq \theta ^{2^{n}}\|\mathbf {x} _{n+1}-\mathbf {x} _{n}\|\leq {\frac {\theta ^{2^{n}}}{2^{n}}}\|\mathbf {h} _{0}\|.}

Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973),[6][7] Gragg-Tapia (1974), Potra-Ptak (1980),[8] Miel (1981),[9] Potra (1984),[10] can be derived from the Kantorovich theorem.[11]

Generalizations

There is a q-analog for the Kantorovich theorem.[12][13] For other generalizations/variations, see Ortega & Rheinboldt (1970).[14]

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.[15]

References

  1. ^ a b Deuflhard, P. (2004). Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics. Vol. 35. Berlin: Springer. ISBN 3-540-21099-7.
  2. ^ a b Zeidler, E. (1985). Nonlinear Functional Analysis and its Applications: Part 1: Fixed-Point Theorems. New York: Springer. ISBN 0-387-96499-1.
  3. ^ Dennis, John E.; Schnabel, Robert B. (1983). "The Kantorovich and Contractive Mapping Theorems". Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs: Prentice-Hall. pp. 92–94. ISBN 0-13-627216-9.
  4. ^ Ortega, J. M. (1968). "The Newton-Kantorovich Theorem". Amer. Math. Monthly. 75 (6): 658–660. doi:10.2307/2313800. JSTOR 2313800.
  5. ^ Gragg, W. B.; Tapia, R. A. (1974). "Optimal Error Bounds for the Newton-Kantorovich Theorem". SIAM Journal on Numerical Analysis. 11 (1): 10–13. Bibcode:1974SJNA...11...10G. doi:10.1137/0711002. JSTOR 2156425.
  6. ^ Ostrowski, A. M. (1971). "La method de Newton dans les espaces de Banach". C. R. Acad. Sci. Paris. 27 (A): 1251–1253.
  7. ^ Ostrowski, A. M. (1973). Solution of Equations in Euclidean and Banach Spaces. New York: Academic Press. ISBN 0-12-530260-6.
  8. ^ Potra, F. A.; Ptak, V. (1980). "Sharp error bounds for Newton's process". Numer. Math. 34: 63–72. doi:10.1007/BF01463998.
  9. ^ Miel, G. J. (1981). "An updated version of the Kantorovich theorem for Newton's method". Computing. 27 (3): 237–244. doi:10.1007/BF02237981.
  10. ^ Potra, F. A. (1984). "On the a posteriori error estimates for Newton's method". Beiträge zur Numerische Mathematik. 12: 125–138.
  11. ^ Yamamoto, T. (1986). "A method for finding sharp error bounds for Newton's method under the Kantorovich assumptions". Numerische Mathematik. 49 (2–3): 203–220. doi:10.1007/BF01389624.
  12. ^ Rajkovic, P. M.; Stankovic, M. S.; Marinkovic, S. D. (2003). "On q-iterative methods for solving equations and systems". Novi Sad J. Math. 33 (2): 127–137.
  13. ^ Rajković, P. M.; Marinković, S. D.; Stanković, M. S. (2005). "On q-Newton–Kantorovich method for solving systems of equations". Applied Mathematics and Computation. 168 (2): 1432–1448. doi:10.1016/j.amc.2004.10.035.
  14. ^ Ortega, J. M.; Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. SIAM. OCLC 95021.
  15. ^ Oishi, S.; Tanabe, K. (2009). "Numerical Inclusion of Optimum Point for Linear Programming". JSIAM Letters. 1: 5–8. doi:10.14495/jsiaml.1.5.

Further reading

  • John H. Hubbard and Barbara Burke Hubbard: Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach, Matrix Editions, ISBN 978-0-9715766-3-6 (preview of 3. edition and sample material including Kant.-thm.)
  • Yamamoto, Tetsuro (2001). "Historical Developments in Convergence Analysis for Newton's and Newton-like Methods". In Brezinski, C.; Wuytack, L. (eds.). Numerical Analysis : Historical Developments in the 20th Century. North-Holland. pp. 241–263. ISBN 0-444-50617-9.