Center-of-gravity method

The center-of-gravity method is a theoretical method for convex optimization. It can be seen as a generalization of the bisection method from one-dimensional functions to multi-dimensional functions.[1]: Sec.8.2.2  It is theoretically important as it attains the optimal convergence rate. However, it has little practical value, as each step is computationally very expensive.

Input

Our goal is to solve a convex optimization problem of the form:

minimize f(x) s.t. x in G,

where f is a convex function, and G is a convex subset of a Euclidean space Rn.

We assume that we have a "subgradient oracle": a routine that can compute a subgradient of f at any given point (if f is differentiable, then the only subgradient is the gradient {\displaystyle \nabla f}; but we do not assume that f is differentiable).
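For concreteness, the following minimal Python sketch shows one possible subgradient oracle, for the non-differentiable convex function f(x) = ‖x‖1 (the function and names are illustrative, not taken from the source):

```python
import numpy as np

def f(x):
    """Convex but non-differentiable objective: f(x) = ||x||_1."""
    return np.sum(np.abs(x))

def subgradient_oracle(x):
    """Return one subgradient of f at x.

    For f(x) = ||x||_1, sign(x) is a valid subgradient: at coordinates
    with x_i = 0 any value in [-1, 1] is allowed, and sign gives 0.
    """
    return np.sign(x)
```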

Method

The method is iterative. At each iteration t, we keep a convex region Gt that is guaranteed to contain the desired minimum. Initially we have G0 = G. Then, each iteration t proceeds as follows.

  • Let xt be the center of gravity of Gt.
  • Compute a subgradient at xt, denoted f'(xt).
    • By definition of a subgradient, the graph of f lies above the affine minorant defined by the subgradient, so for all x in Gt: f(x) − f(xt) ≥ (x − xt)Tf'(xt).
  • If f'(xt)=0, then the above implies that xt is an exact minimum point, so we terminate and return xt.
  • Otherwise, let Gt+1 := {x in Gt: (x − xt)Tf'(xt) ≤ 0}.

Note that, by the above inequality, every minimum point of f must be in Gt+1.[1]: Sec.8.2.2 
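To make the bookkeeping concrete, here is a Python sketch of the iteration under simplifying assumptions: G is an axis-aligned box, and the exact center of gravity of Gt is replaced by a Monte Carlo estimate obtained by rejection sampling over the initial box (the exact computation is what makes the method expensive, as discussed below). The function names and the sampling shortcut are illustrative, not part of the source.

```python
import numpy as np

def center_of_gravity_method(f, subgrad, box, iters=30, samples=20000, rng=None):
    """Sketch of the center-of-gravity method on G = an axis-aligned box.

    The true centroid of the current region G_t is replaced by a Monte Carlo
    estimate (rejection sampling over the initial box), since computing it
    exactly is the expensive step discussed in the article.
    `box` is a list of (low, high) pairs, one per dimension.
    """
    rng = np.random.default_rng(rng)
    low = np.array([l for l, _ in box], float)
    high = np.array([h for _, h in box], float)
    cuts = []          # each cut is (g, x_t): keep points with g.(x - x_t) <= 0
    best_x, best_val = None, np.inf

    for _ in range(iters):
        # Sample points uniformly from the box and keep those still in G_t.
        pts = rng.uniform(low, high, size=(samples, len(box)))
        mask = np.ones(samples, dtype=bool)
        for g, xt in cuts:
            mask &= (pts - xt) @ g <= 0
        if not mask.any():
            break                       # region too small for this sample size
        x_t = pts[mask].mean(axis=0)    # approximate center of gravity of G_t

        val = f(x_t)
        if val < best_val:
            best_x, best_val = x_t, val

        g = subgrad(x_t)
        if np.allclose(g, 0):
            return x_t                  # exact minimizer found
        cuts.append((g, x_t))           # G_{t+1} = {x in G_t : g.(x - x_t) <= 0}

    return best_x
```

With the ℓ1 oracle above shifted to some target point, this sketch converges to that point on a small box; it is only meant to show how the cuts defining Gt+1 are accumulated, not to be a practical implementation.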

Convergence

It can be proved that

{\displaystyle Volume(G_{t+1})\leq \left[1-\left({\frac {n}{n+1}}\right)^{n}\right]\cdot Volume(G_{t})}.

Therefore,

{\displaystyle f(x_{t})-\min _{G}f\leq \left[1-\left({\frac {n}{n+1}}\right)^{n}\right]^{t/n}\left[\max _{G}f-\min _{G}f\right]}.

In other words, the method has linear convergence of the residual objective value, with convergence rate {\displaystyle \left[1-\left({\frac {n}{n+1}}\right)^{n}\right]^{1/n}\leq (1-1/e)^{1/n}}. To get an ε-approximation to the objective value, the number of required steps is at most {\displaystyle 2.13n\ln(1/\epsilon )+1}.[1]: Sec.8.2.2
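As a numerical illustration of this bound (taking the quoted inequalities at face value), one can evaluate the per-step volume factor and solve for the required number of steps directly:

```python
import math

def cog_iteration_bound(n, eps):
    """Per-step volume shrinkage factor and an upper bound on the number of
    steps needed for an eps-approximation, per the bound quoted above."""
    rate = 1 - (n / (n + 1)) ** n                 # volume factor per step
    steps = n * math.log(eps) / math.log(rate)    # solve rate**(t/n) <= eps for t
    return rate, math.ceil(steps)

# For n = 10 and eps = 1e-6 this gives roughly 284 steps, consistent with
# the quoted bound 2.13 * n * ln(1/eps) + 1, which is about 295 here.
print(cog_iteration_bound(10, 1e-6))
```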

Computational complexity

The main problem with the method is that, in each step, we have to compute the center-of-gravity of a polytope. All the methods known so far for this problem require a number of arithmetic operations that is exponential in the dimension n.[1]: Sec.8.2.2  Therefore, the method is not useful in practice when there are 5 or more dimensions.

See also

The ellipsoid method can be seen as a tractable approximation to the center-of-gravity method. Instead of maintaining the feasible polytope Gt, it maintains an ellipsoid that contains it. Computing the center of an ellipsoid is much easier than computing the center of gravity of a general polytope, and hence each step of the ellipsoid method can be carried out in polynomial time.

References

  1. Nemirovsky and Ben-Tal (2023). "Optimization III: Convex Optimization" (PDF).