Symmetric rank-one

The symmetric rank-one (SR1) method is a quasi-Newton method that updates an approximation of the second derivative (Hessian) using the derivatives (gradients) calculated at two points. It is a generalization of the secant method to multidimensional problems. The update maintains the symmetry of the matrix but does not guarantee that the updated matrix is positive definite.

In theory, the sequence of Hessian approximations generated by the SR1 method converges to the true Hessian under mild conditions; in preliminary numerical experiments, the approximate Hessians generated by the SR1 method approached the true Hessian faster than those of popular alternatives (BFGS or DFP).[1][2] The SR1 method has computational advantages for sparse or partially separable problems.[3]

A twice continuously differentiable function $x\mapsto f(x)$ has a gradient ($\nabla f$) and Hessian matrix $B$. The function $f$ has an expansion as a Taylor series at $x_{0}$, which can be truncated

$$f(x_{0}+\Delta x)\approx f(x_{0})+\nabla f(x_{0})^{T}\Delta x+{\frac {1}{2}}\Delta x^{T}B\,\Delta x;$$

its gradient also has a Taylor-series approximation

$$\nabla f(x_{0}+\Delta x)\approx \nabla f(x_{0})+B\,\Delta x,$$

which is used to update $B$. The secant equation above need not have a unique solution $B$. The SR1 formula computes (via an update of rank 1) the symmetric solution that is closest to the current approximation $B_{k}$:

$$B_{k+1}=B_{k}+{\frac {(y_{k}-B_{k}\Delta x_{k})(y_{k}-B_{k}\Delta x_{k})^{T}}{(y_{k}-B_{k}\Delta x_{k})^{T}\Delta x_{k}}},$$

where

$$y_{k}=\nabla f(x_{k}+\Delta x_{k})-\nabla f(x_{k}).$$
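The update formula translates almost line by line into code. The following is a minimal sketch in plain Python, assuming small dense matrices stored as lists of lists. The function name `sr1_update` and the test data (gradient differences of a quadratic with Hessian `A`) are illustrative choices, not part of any library.

```python
def sr1_update(B, dx, y):
    """Return B + r r^T / (r^T dx), where r = y - B dx.

    B is a symmetric n x n matrix (list of lists); dx and y are length-n lists.
    """
    n = len(dx)
    # Residual of the secant equation for the current approximation.
    r = [y[i] - sum(B[i][j] * dx[j] for j in range(n)) for i in range(n)]
    denom = sum(r[i] * dx[i] for i in range(n))
    if denom == 0.0:
        raise ZeroDivisionError("SR1 denominator vanished; update undefined")
    # Symmetric rank-one correction.
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]


# Illustrative data: for f(x) = 0.5 x^T A x the gradient difference is y = A dx.
A = [[2.0, 1.0], [1.0, 3.0]]
B0 = [[1.0, 0.0], [0.0, 1.0]]
dx = [1.0, 0.0]
y = [sum(A[i][j] * dx[j] for j in range(2)) for i in range(2)]

B1 = sr1_update(B0, dx, y)
# The updated matrix should map dx to y (the secant equation).
B1_dx = [sum(B1[i][j] * dx[j] for j in range(2)) for i in range(2)]
```

In this example the updated matrix satisfies the secant equation `B1 dx = y` after a single step, although it does not yet equal `A`; only the curvature along `dx` has been corrected.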

The corresponding update to the approximate inverse Hessian $H_{k}=B_{k}^{-1}$ is

$$H_{k+1}=H_{k}+{\frac {(\Delta x_{k}-H_{k}y_{k})(\Delta x_{k}-H_{k}y_{k})^{T}}{(\Delta x_{k}-H_{k}y_{k})^{T}y_{k}}}.$$
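The inverse form has the same structure with the roles of the step and the gradient difference exchanged. The sketch below, again in plain Python with illustrative names and data, applies it to $H_{0}=I$ and checks the result against the matrix that the direct formula gives for the same data (worked out by hand in the comment).

```python
def sr1_inverse_update(H, dx, y):
    """Return H + s s^T / (s^T y), where s = dx - H y."""
    n = len(dx)
    Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    s = [dx[i] - Hy[i] for i in range(n)]
    denom = sum(s[i] * y[i] for i in range(n))
    if denom == 0.0:
        raise ZeroDivisionError("SR1 denominator vanished; update undefined")
    return [[H[i][j] + s[i] * s[j] / denom for j in range(n)] for i in range(n)]


# Illustrative data (an assumption for this example).
H0 = [[1.0, 0.0], [0.0, 1.0]]
dx = [1.0, 0.0]
y = [2.0, 1.0]

H1 = sr1_inverse_update(H0, dx, y)
# Inverse secant equation: H1 y should reproduce dx.
H1_y = [sum(H1[i][j] * y[j] for j in range(2)) for i in range(2)]

# Direct update from B0 = I with the same data, by hand:
# r = y - B0 dx = (1, 1), r^T dx = 1, so B1 = I + r r^T = [[2, 1], [1, 2]].
B1 = [[2.0, 1.0], [1.0, 2.0]]
product = [[sum(H1[i][k] * B1[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

Up to rounding, `product` is the identity matrix, which is consistent with the two formulas describing the same update.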

One might wonder why positive-definiteness is not preserved; after all, a rank-1 update of the form $B_{k+1}=B_{k}+vv^{T}$ is positive-definite if $B_{k}$ is. The explanation is that the update might be of the form $B_{k+1}=B_{k}-vv^{T}$ instead because the denominator can be negative, and in that case there are no guarantees about positive-definiteness.
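A small worked example makes the loss of positive-definiteness concrete. The data below are chosen purely for illustration: the current approximation is the identity, and the gradient difference points against the step.

```latex
\begin{aligned}
B_k &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\Delta x_k = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
y_k = \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \\
y_k - B_k \Delta x_k &= \begin{pmatrix} -2 \\ 0 \end{pmatrix}, \qquad
(y_k - B_k \Delta x_k)^T \Delta x_k = -2 < 0, \\
B_{k+1} &= B_k + \frac{1}{-2} \begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix}
         = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.
\end{aligned}
```

The result satisfies the secant equation $B_{k+1}\Delta x_{k}=y_{k}$, but its eigenvalues are $-1$ and $1$, so it is indefinite. Because $y_{k}^{T}\Delta x_{k}=-1<0$ here, no positive-definite matrix could satisfy the secant equation for this data.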

The SR1 formula has been rediscovered a number of times. A drawback is that the denominator can vanish. Some authors have suggested that the update be applied only if

$$|\Delta x_{k}^{T}(y_{k}-B_{k}\Delta x_{k})|\geq r\,\|\Delta x_{k}\|\cdot \|y_{k}-B_{k}\Delta x_{k}\|,$$

where $r\in (0,1)$ is a small number, e.g. $10^{-8}$.[4]
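The skipping rule can be sketched as a guard placed in front of the update. The function name `sr1_update_safeguarded` and the returned flag are illustrative choices. The extra check for an exactly zero denominator is an assumption added here so that the case $y_{k}=B_{k}\Delta x_{k}$, where both sides of the inequality are zero, leaves the matrix unchanged instead of dividing by zero.

```python
import math


def sr1_update_safeguarded(B, dx, y, r=1e-8):
    """Apply the SR1 update only if the denominator is safely nonzero.

    Returns (B_new, applied); B_new is a copy of B when the update is skipped.
    """
    n = len(dx)
    resid = [y[i] - sum(B[i][j] * dx[j] for j in range(n)) for i in range(n)]
    denom = sum(dx[i] * resid[i] for i in range(n))
    norm_dx = math.sqrt(sum(v * v for v in dx))
    norm_resid = math.sqrt(sum(v * v for v in resid))
    # Skip when |dx^T resid| < r * ||dx|| * ||resid||; the denom == 0 guard
    # is an added assumption covering the case resid == 0.
    if denom == 0.0 or abs(denom) < r * norm_dx * norm_resid:
        return [row[:] for row in B], False
    B_new = [[B[i][j] + resid[i] * resid[j] / denom for j in range(n)]
             for i in range(n)]
    return B_new, True


identity = [[1.0, 0.0], [0.0, 1.0]]

# Well-conditioned case: the update is applied.
B_ok, applied_ok = sr1_update_safeguarded(identity, [1.0, 0.0], [2.0, 1.0])

# Residual (0, 1) is orthogonal to the step (1, 0): the denominator vanishes
# and the update is skipped.
B_skip, applied_skip = sr1_update_safeguarded(identity, [1.0, 0.0], [1.0, 1.0])
```

Returning a flag alongside the matrix lets a caller count skipped updates, which can be useful when diagnosing slow progress.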

References

  1. ^ Conn, A. R.; Gould, N. I. M.; Toint, Ph. L. (March 1991). "Convergence of quasi-Newton matrices generated by the symmetric rank one update". Mathematical Programming. 50 (1). Springer Berlin/ Heidelberg: 177–195. doi:10.1007/BF01594934. ISSN 0025-5610. S2CID 28028770.
  2. ^ Khalfan, H. Fayez; et al. (1993). "A Theoretical and Experimental Study of the Symmetric Rank-One Update". SIAM Journal on Optimization. 3 (1): 1–24. doi:10.1137/0803001.
  3. ^ Byrd, Richard H.; et al. (1996). "Analysis of a Symmetric Rank-One Trust Region Method". SIAM Journal on Optimization. 6 (4): 1025–1039. doi:10.1137/S1052623493252985.
  4. ^ Nocedal, Jorge; Wright, Stephen J. (1999). Numerical Optimization. Springer. ISBN 0-387-98793-2.