Sequential quadratic programming


Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization which may be considered a quasi-Newton method. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable, but not necessarily convex.

SQP methods solve a sequence of optimization subproblems, each of which optimizes a quadratic model of the objective subject to a linearization of the constraints. If the problem is unconstrained, then the method reduces to Newton's method for finding a point where the gradient of the objective vanishes. If the problem has only equality constraints, then the method is equivalent to applying Newton's method to the first-order optimality conditions, or Karush–Kuhn–Tucker conditions, of the problem.

Algorithm basics

[Figure: overview schematic illustrating the basic SQP algorithm]

Consider a nonlinear programming problem of the form:

$$
\begin{array}{rl}
\min\limits_{x} & f(x) \\
\text{subject to} & h(x) \geq 0 \\
 & g(x) = 0.
\end{array}
$$

The Lagrangian for this problem is[1]

$$
\mathcal{L}(x,\lambda,\sigma) = f(x) + \lambda\,h(x) + \sigma\,g(x),
$$

where $\lambda$ and $\sigma$ are Lagrange multipliers.
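
For instance, for the equality-constrained toy problem

$$
\min\limits_{x} \; x_{1}^{2} + x_{2}^{2} \quad \text{subject to} \quad x_{1} + x_{2} - 1 = 0,
$$

the Lagrangian reduces to

$$
\mathcal{L}(x,\sigma) = x_{1}^{2} + x_{2}^{2} + \sigma\,(x_{1} + x_{2} - 1),
$$

with a single multiplier $\sigma$ and no inequality term.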

The standard Newton's method searches for a solution of $\nabla\mathcal{L}(x,\lambda,\sigma) = 0$ by iterating the following equation, where $\nabla_{xx}^{2}$ denotes the Hessian matrix:

$$
\begin{bmatrix} x_{k+1} \\ \lambda_{k+1} \\ \sigma_{k+1} \end{bmatrix}
=
\begin{bmatrix} x_{k} \\ \lambda_{k} \\ \sigma_{k} \end{bmatrix}
-
\underbrace{\begin{bmatrix}
\nabla_{xx}^{2}\mathcal{L} & \nabla h & \nabla g \\
\nabla h^{T} & 0 & 0 \\
\nabla g^{T} & 0 & 0
\end{bmatrix}^{-1}}_{\nabla^{2}\mathcal{L}}
\underbrace{\begin{bmatrix}
\nabla f + \lambda_{k}\nabla h + \sigma_{k}\nabla g \\
h \\
g
\end{bmatrix}}_{\nabla\mathcal{L}}.
$$

However, because the matrix $\nabla^{2}\mathcal{L}$ is generally singular (and therefore non-invertible), the Newton step $d_{k} = \left(\nabla^{2}\mathcal{L}\right)^{-1}\nabla\mathcal{L}$ cannot be calculated directly. Instead, the basic sequential quadratic programming algorithm defines an appropriate search direction $d_{k}$ at an iterate $(x_{k},\lambda_{k},\sigma_{k})$ as a solution to the quadratic programming subproblem

$$
\begin{array}{rl}
\min\limits_{d} & f(x_{k}) + \nabla f(x_{k})^{T} d + \tfrac{1}{2}\, d^{T} \nabla_{xx}^{2}\mathcal{L}(x_{k},\lambda_{k},\sigma_{k})\, d \\
\mathrm{s.t.} & h(x_{k}) + \nabla h(x_{k})^{T} d \geq 0 \\
 & g(x_{k}) + \nabla g(x_{k})^{T} d = 0,
\end{array}
$$

where the quadratic form is built from the Hessian of the Lagrangian. Note that the term $f(x_{k})$ in the expression above may be left out, since it is constant under the $\min\limits_{d}$ operator.

The SQP algorithm starts by choosing an initial iterate $(x_{0},\lambda_{0},\sigma_{0})$, then calculating $\nabla^{2}\mathcal{L}(x_{0},\lambda_{0},\sigma_{0})$ and $\nabla\mathcal{L}(x_{0},\lambda_{0},\sigma_{0})$. The QP subproblem is then built and solved to find the Newton step direction $d_{0}$, which is used to update the parent problem iterate using $\left[x_{k+1},\lambda_{k+1},\sigma_{k+1}\right]^{T} = \left[x_{k},\lambda_{k},\sigma_{k}\right]^{T} + d_{k}$. This process is repeated for $k = 0, 1, 2, \ldots$ until the parent problem satisfies a convergence test.
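
As an illustration, the following is a minimal NumPy sketch of this basic iteration for a problem with a single equality constraint and no inequalities, so the KKT system stays square and can be solved directly. The toy objective, constraint, and starting point are invented for the example:

    import numpy as np

    # Toy equality-constrained problem (invented for illustration):
    #   minimize f(x) = x1^2 + x2^2  subject to  g(x) = x1^2 + x2 - 1 = 0.
    f_grad = lambda x: 2.0 * x                             # gradient of f
    f_hess = lambda x: 2.0 * np.eye(2)                     # Hessian of f
    g      = lambda x: np.array([x[0]**2 + x[1] - 1.0])    # constraint value
    g_jac  = lambda x: np.array([[2.0 * x[0], 1.0]])       # constraint Jacobian (1 x 2)
    g_hess = lambda x: np.array([[2.0, 0.0], [0.0, 0.0]])  # Hessian of g

    x, sigma = np.array([2.0, 2.0]), np.array([0.0])       # initial iterate (x_0, sigma_0)
    for k in range(50):
        J = g_jac(x)
        # Gradient of the Lagrangian L = f + sigma*g, stacked with g itself.
        grad_L = np.concatenate([f_grad(x) + J.T @ sigma, g(x)])
        if np.linalg.norm(grad_L) < 1e-10:                 # convergence test
            break
        # Hessian of the Lagrangian in x: nabla_xx^2 L = nabla^2 f + sigma * nabla^2 g.
        H = f_hess(x) + sigma[0] * g_hess(x)
        # Assemble the KKT matrix nabla^2 L and take the Newton step
        # d_k = -(nabla^2 L)^{-1} nabla L.
        KKT = np.block([[H, J.T], [J, np.zeros((1, 1))]])
        d = np.linalg.solve(KKT, -grad_L)
        x, sigma = x + d[:2], sigma + d[2:]

    print(x, sigma)   # a KKT point of the toy problem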

Practical implementations

Practical implementations of the SQP algorithm are significantly more complex than its basic version above. To adapt SQP for real-world applications, the following challenges must be addressed:

  • The possibility of an infeasible QP subproblem.
  • A QP subproblem yielding a bad step: one that either fails to reduce the objective or increases the constraint violation.
  • Breakdown of the iteration due to significant deviation of the objective/constraints from their quadratic/linear models.

To overcome these challenges, various strategies are typically employed:

  • Use of merit functions, which assess progress towards a constrained solution, or filter methods.
  • Trust-region or line-search methods to manage deviations between the quadratic model and the actual objective.
  • Special feasibility restoration phases to handle infeasible subproblems, or the use of L1-penalized subproblems to gradually decrease infeasibility.

These strategies can be combined in numerous ways, resulting in a diverse range of SQP methods; a minimal sketch of the merit-function line search is given below.
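
For illustration only, here is a sketch of the merit-function idea under simple assumptions: an $\ell_{1}$ merit function $\phi_{\mu}(x) = f(x) + \mu\left(\lVert g(x)\rVert_{1} + \lVert\max(0, -h(x))\rVert_{1}\right)$ combined with a backtracking line search along an SQP step $d$. All function names are placeholders, and the plain decrease test stands in for the more careful sufficient-decrease criteria and penalty updates used in production codes:

    import numpy as np

    def l1_merit(x, f, g, h, mu):
        """l1 merit function: objective plus mu-weighted constraint violation."""
        eq_viol = np.sum(np.abs(g(x)))               # violation of g(x) = 0
        ineq_viol = np.sum(np.maximum(0.0, -h(x)))   # violation of h(x) >= 0
        return f(x) + mu * (eq_viol + ineq_viol)

    def line_search(x, d, f, g, h, mu=10.0, beta=0.5, max_halvings=30):
        """Backtrack along the SQP step d until the merit function decreases."""
        phi0 = l1_merit(x, f, g, h, mu)
        alpha = 1.0
        for _ in range(max_halvings):
            if l1_merit(x + alpha * d, f, g, h, mu) < phi0:
                return x + alpha * d                 # accepted (possibly shortened) step
            alpha *= beta                            # shrink the step and retry
        return x                                     # no acceptable step: caller must recover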

Alternative approaches

  • Sequential linear programming
  • Sequential linear-quadratic programming
  • Augmented Lagrangian method

Implementations

SQP methods have been implemented in well-known numerical environments such as MATLAB and GNU Octave. There also exist numerous software libraries, including open source:

  • SciPy (the de facto standard for scientific Python) provides the scipy.optimize.minimize(method='SLSQP') solver; a usage sketch follows this list.
  • NLopt (C/C++ implementation, with numerous interfaces including Julia, Python, R, MATLAB/Octave), implemented by Dieter Kraft as part of a package for optimal control, and modified by S. G. Johnson.[2][3]
  • ALGLIB SQP solver (C++, C#, Java, Python API)

and commercial solvers such as KNITRO.[4]
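
As a usage sketch for the SciPy solver mentioned above (the toy objective, constraint, and starting point are invented for the example):

    import numpy as np
    from scipy.optimize import minimize

    # Minimize f(x) = x1^2 + x2^2 subject to h(x) = x1 + x2 - 1 >= 0,
    # a toy problem solved with SciPy's SLSQP implementation.
    result = minimize(
        fun=lambda x: x[0]**2 + x[1]**2,
        x0=np.array([2.0, 0.0]),
        method='SLSQP',
        constraints=[{'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1.0}],
    )
    print(result.x)   # approximately [0.5, 0.5]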

Notes

  1. ^ Jorge Nocedal and Stephen J. Wright (2006). Numerical Optimization. Springer. ISBN 978-0-387-30303-1.
  2. ^ Kraft, Dieter (Sep 1994). "Algorithm 733: TOMP–Fortran modules for optimal control calculations". ACM Transactions on Mathematical Software. 20 (3): 262–281. CiteSeerX 10.1.1.512.2567. doi:10.1145/192115.192124. S2CID 16077051. Retrieved 1 February 2019.
  3. ^ "NLopt Algorithms: SLSQP". Read the Docs. July 1988. Retrieved 1 February 2019.
  4. ^ KNITRO User Guide: Algorithms

References

  • Bonnans, J. Frédéric; Gilbert, J. Charles; Lemaréchal, Claude; Sagastizábal, Claudia A. (2006). Numerical optimization: Theoretical and practical aspects. Universitext (Second revised ed. of translation of 1997 French ed.). Berlin: Springer-Verlag. pp. xiv+490. doi:10.1007/978-3-540-35447-5. ISBN 978-3-540-35445-1. MR 2265882.
  • Jorge Nocedal and Stephen J. Wright (2006). Numerical Optimization. Springer. ISBN 978-0-387-30303-1.

External links

  • Sequential Quadratic Programming at NEOS guide