Descent direction

In optimization, a descent direction is a vector $\mathbf{p} \in \mathbb{R}^n$ that points towards a local minimum $\mathbf{x}^*$ of an objective function $f : \mathbb{R}^n \to \mathbb{R}$.

Computing $\mathbf{x}^*$ by an iterative method, such as line search, defines a descent direction $\mathbf{p}_k \in \mathbb{R}^n$ at the $k$th iterate to be any $\mathbf{p}_k$ such that $\langle \mathbf{p}_k, \nabla f(\mathbf{x}_k) \rangle < 0$, where $\langle \cdot , \cdot \rangle$ denotes the inner product. The motivation for such an approach is that, by Taylor's theorem, sufficiently small steps along $\mathbf{p}_k$ guarantee that $f$ is reduced.
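
The following minimal sketch illustrates the definition numerically. The quadratic objective, its gradient, the iterate $\mathbf{x}_k$, and the candidate direction $\mathbf{p}_k$ are all hypothetical choices made only for this example:

```python
import numpy as np

# Hypothetical quadratic objective f(x) = 0.5 * x^T A x - b^T x and its gradient.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

def f(x):
    return 0.5 * x @ A @ x - b @ x

def grad_f(x):
    return A @ x - b

x_k = np.array([2.0, -1.0])          # current iterate (example value)
p_k = np.array([-1.0, 0.5])          # candidate search direction (example value)

g_k = grad_f(x_k)
print(np.dot(p_k, g_k) < 0)          # descent condition: <p_k, grad f(x_k)> < 0

# Taylor's theorem motivation: for a sufficiently small step t > 0,
# f(x_k + t p_k) < f(x_k) whenever p_k is a descent direction.
t = 1e-3
print(f(x_k + t * p_k) < f(x_k))     # True
```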

Using this definition, the negative of a non-zero gradient is always a descent direction, as $\langle -\nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\langle \nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle < 0$.
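
Continuing the hypothetical quadratic sketch above, the negative gradient can be checked the same way; the inner product reduces to $-\|\nabla f(\mathbf{x}_k)\|^2$, which is negative whenever the gradient is non-zero:

```python
# Continuing the sketch above: the negative gradient as a descent direction.
p_k = -grad_f(x_k)
inner = np.dot(p_k, grad_f(x_k))
print(inner < 0)                                            # True for non-zero gradient
print(np.isclose(inner, -np.linalg.norm(grad_f(x_k)) ** 2))  # equals -||grad f(x_k)||^2
```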

Numerous methods exist to compute descent directions, each with differing merits; examples include gradient descent and the conjugate gradient method.

More generally, if $P$ is a positive definite matrix, then $\mathbf{p}_k = -P \nabla f(\mathbf{x}_k)$ is a descent direction at $\mathbf{x}_k$.[1] This generality is used in preconditioned gradient descent methods.
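
As a sketch under the same hypothetical setup, an arbitrary positive definite matrix $P$ (here a diagonal example) yields a descent direction, since $\langle -P \nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\nabla f(\mathbf{x}_k)^{\mathsf T} P \nabla f(\mathbf{x}_k) < 0$ for a non-zero gradient:

```python
# Continuing the sketch: a preconditioned direction with a positive definite P.
P = np.diag([0.5, 2.0])          # arbitrary positive definite preconditioner (example value)
g_k = grad_f(x_k)
p_k = -P @ g_k
print(np.dot(p_k, g_k) < 0)      # -g_k^T P g_k < 0 whenever g_k != 0
```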

See also

  • Directional derivative

References

  1. ^ Ortega, J. M.; Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. p. 243. doi:10.1137/1.9780898719468.