Interior point method

From Wikipedia, the free encyclopedia

Interior point methods (also referred to as barrier methods) are a class of algorithms for solving linear and nonlinear convex optimization problems.

Example solution

The interior point method was invented by John von Neumann.[1] Von Neumann suggested a new method of linear programming using the homogeneous linear system of Gordan (1873). The approach was later popularized by Karmarkar's algorithm, which Narendra Karmarkar developed in 1984 for linear programming. The method uses a self-concordant barrier function to encode the convex feasible set. In contrast to the simplex method, it reaches an optimal solution by traversing the interior of the feasible region.

Any convex optimization problem can be transformed into minimizing (or maximizing) a linear function over a convex set by converting to the epigraph form.[2] The idea of encoding the feasible set using a barrier and designing barrier methods was studied by Anthony V. Fiacco, Garth P. McCormick, and others in the early 1960s. These ideas were mainly developed for general nonlinear programming, but they were later abandoned due to the presence of more competitive methods for this class of problems (e.g. sequential quadratic programming).
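The epigraph transformation mentioned above can be written out as a short illustration. The symbols K (a convex set) and t (an auxiliary scalar variable) are placeholders introduced here and are not notation from the cited source. Minimizing a convex function f over K is equivalent to

\min_{x \in K} f(x) \quad \Longleftrightarrow \quad \min_{(x,t)} \; t \quad \text{subject to } x \in K,\ f(x) \le t

The new objective t is linear in (x, t), and the set \{(x,t) : x \in K,\ f(x) \le t\} is convex because f is convex.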

Yurii Nesterov and Arkadi Nemirovski came up with a special class of such barriers that can be used to encode any convex set. These barriers guarantee that the number of iterations of the algorithm is bounded by a polynomial in the dimension and accuracy of the solution.[3]

Karmarkar's breakthrough revitalized the study of interior point methods and barrier problems, showing that it was possible to create an algorithm for linear programming that had polynomial complexity and was also competitive with the simplex method. Khachiyan's ellipsoid method was already a polynomial-time algorithm, but it was too slow to be of practical interest.

The class of primal-dual path-following interior point methods is considered the most successful. Mehrotra's predictor-corrector algorithm provides the basis for most implementations of this class of methods[citation needed].

Primal-dual interior point method for nonlinear optimization

The primal-dual method's idea is easy to demonstrate for constrained nonlinear optimization. For simplicity consider the all-inequality version of a nonlinear optimization problem:

\text{minimize } f(x) \quad \text{subject to } c(x) \ge 0, \quad x \in \mathbb{R}^n,\ c(x) \in \mathbb{R}^m \qquad (1)

The logarithmic barrier function associated with (1) is

B(x,\mu) = f(x) - \mu \sum_{i=1}^m \ln(c_i(x)) \qquad (2)

Here \mu is a small positive scalar, sometimes called the "barrier parameter". As \mu converges to zero the minimum of B(x,\mu) should converge to a solution of (1).
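The limiting behaviour can be checked numerically on a toy problem. The sketch below is illustrative only: the problem (minimize f(x) = x subject to c(x) = x - 1 >= 0) and the helper names `barrier` and `minimize_barrier` are assumptions chosen for this example, not part of the source. For this problem the barrier minimizer is x = 1 + \mu, which approaches the constrained solution x = 1 as \mu shrinks.

```python
import math

def barrier(x, mu):
    """B(x, mu) = f(x) - mu * ln(c(x)) for the toy problem f(x) = x, c(x) = x - 1."""
    return x - mu * math.log(x - 1.0)

def minimize_barrier(mu, x0=2.0, tol=1e-12, max_iter=100):
    """Minimize B(., mu) by Newton's method, shortening steps that would leave x > 1."""
    x = x0
    for _ in range(max_iter):
        grad = 1.0 - mu / (x - 1.0)        # dB/dx
        hess = mu / (x - 1.0) ** 2         # d2B/dx2, positive inside the domain
        step = -grad / hess
        while x + step <= 1.0:             # keep the iterate strictly feasible
            step *= 0.5
        x += step
        if abs(step) < tol:
            break
    return x

# Minimizers of the barrier function for a decreasing sequence of mu values.
minimizers = {mu: minimize_barrier(mu) for mu in (1.0, 0.1, 0.01, 0.001)}
for mu, x_mu in minimizers.items():
    print(f"mu = {mu:<6} barrier minimizer x = {x_mu:.6f}")
```

Each minimizer stays strictly inside the feasible region, which is the sense in which the method traverses the interior.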

The barrier function gradient is

g_b = g - \mu \sum_{i=1}^m \frac{1}{c_i(x)} \nabla c_i(x) \qquad (3)

where g is the gradient of the original function f(x) and \nabla c_i is the gradient of c_i.
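Formula (3) can be sanity-checked against a finite-difference approximation of (2). The following is a minimal sketch under assumed data: the objective, the two constraints, the evaluation point, and all function names are hypothetical choices made for illustration.

```python
import math

def f(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad_f(x):
    return [2.0 * x[0], 4.0 * x[1]]

def c(x):
    # Two inequality constraints c_i(x) >= 0 (one linear, one nonlinear).
    return [x[0] + x[1] - 1.0, 4.0 - x[0] ** 2 - x[1] ** 2]

def grad_c(x):
    # Row i holds the gradient of c_i.
    return [[1.0, 1.0], [-2.0 * x[0], -2.0 * x[1]]]

def barrier(x, mu):
    """Equation (2): B(x, mu) = f(x) - mu * sum_i ln(c_i(x))."""
    return f(x) - mu * sum(math.log(ci) for ci in c(x))

def barrier_gradient(x, mu):
    """Equation (3): g_b = g - mu * sum_i grad c_i(x) / c_i(x)."""
    g_b = grad_f(x)
    for ci, gci in zip(c(x), grad_c(x)):
        g_b = [gj - mu / ci * gcij for gj, gcij in zip(g_b, gci)]
    return g_b

def finite_difference_gradient(x, mu, h=1e-6):
    """Central-difference approximation of the gradient of B."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((barrier(xp, mu) - barrier(xm, mu)) / (2.0 * h))
    return grad

x = [1.0, 0.5]      # strictly feasible point: c(x) = (0.5, 2.75)
mu = 0.1
analytic = barrier_gradient(x, mu)
numeric = finite_difference_gradient(x, mu)
```

The two gradients should agree up to the truncation and rounding error of the finite-difference scheme.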

In addition to the original ("primal") variable x, we introduce a Lagrange-multiplier-inspired dual variable \lambda \in \mathbb{R}^m defined by

c_i(x) \lambda_i = \mu, \quad \forall i = 1, \ldots, m \qquad (4)

Condition (4) is sometimes called the "perturbed complementarity" condition, for its resemblance to "complementary slackness" in the KKT conditions.

We try to find those (x_\mu, \lambda_\mu) for which the gradient of the barrier function is zero.

Substituting (4) into (3) and setting the result to zero gives an equation for the gradient:

g - A^T \lambda = 0 \qquad (5)

where the matrix A is the Jacobian of the constraints c(x).
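The step from (3) and (4) to (5) can be written out term by term. Assuming c_i(x) > 0, so that (4) can be solved for \lambda_i, and taking row i of A to be \nabla c_i(x)^T:

\lambda_i = \frac{\mu}{c_i(x)} \quad \Longrightarrow \quad g_b = g - \sum_{i=1}^m \lambda_i \nabla c_i(x) = g - A^T \lambda

Requiring g_b = 0 then yields (5).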

The intuition behind (5) is that the gradient of f(x) should lie in the subspace spanned by the gradients of the constraints. The "perturbed complementarity" condition (4) with small \mu can be understood as requiring that, for each i, either the solution lies near the boundary c_i(x) = 0 or the projection of the gradient g onto the normal of the constraint c_i(x) is almost zero.

Applying Newton's method to (4) and (5) gives a linear system for the update (p_x, p_\lambda) of (x, \lambda):

\begin{pmatrix}
 W & -A^T \\
 \Lambda A & C
\end{pmatrix}\begin{pmatrix}
 p_x  \\
 p_\lambda
\end{pmatrix}=\begin{pmatrix}
 -g + A^T \lambda  \\
 \mu 1 - C \lambda
\end{pmatrix}

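where W is the Hessian matrix of the Lagrangian f(x) - \lambda^T c(x) with respect to x (equal to the Hessian of f(x) when the constraints are linear), \Lambda is a diagonal matrix with \Lambda_{ii} = \lambda_i, and C is a diagonal matrix with C_{ii} = c_i(x).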
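To make the block structure of the Newton system concrete, the sketch below assembles and solves it once for a small assumed problem: minimize x_0^2 + x_1^2 subject to x_0 - 1 >= 0 and x_1 - 1 >= 0. The problem, the starting point, and the helpers `solve_linear` and `newton_step` are hypothetical and chosen only for illustration. Because the constraints are linear, their second derivatives vanish and W is simply the Hessian of f.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(k + 1, n):
            factor = aug[r][k] / aug[k][k]
            for col in range(k, n + 1):
                aug[r][col] -= factor * aug[k][col]
    z = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(aug[k][j] * z[j] for j in range(k + 1, n))
        z[k] = (aug[k][n] - s) / aug[k][k]
    return z

def newton_step(x, lam, mu):
    """Assemble [[W, -A^T], [Lambda A, C]] and solve for (p_x, p_lambda)."""
    n, m = 2, 2
    g = [2.0 * x[0], 2.0 * x[1]]          # gradient of f(x) = x0^2 + x1^2
    W = [[2.0, 0.0], [0.0, 2.0]]          # Hessian of f (constraints are linear)
    A = [[1.0, 0.0], [0.0, 1.0]]          # Jacobian of c(x) = (x0 - 1, x1 - 1)
    cv = [x[0] - 1.0, x[1] - 1.0]         # constraint values, the diagonal of C
    M = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            M[i][j] = W[i][j]             # block W
        for k in range(m):
            M[i][n + k] = -A[k][i]        # block -A^T
    for k in range(m):
        for j in range(n):
            M[n + k][j] = lam[k] * A[k][j]    # block Lambda A
        M[n + k][n + k] = cv[k]               # block C
    # Right-hand side: (-g + A^T lambda, mu * 1 - C lambda).
    rhs = [-g[i] + sum(A[k][i] * lam[k] for k in range(m)) for i in range(n)]
    rhs += [mu - cv[k] * lam[k] for k in range(m)]
    p = solve_linear(M, rhs)
    return p[:n], p[n:]

p_x, p_lam = newton_step(x=[2.0, 3.0], lam=[1.0, 1.0], mu=0.1)
print("p_x =", p_x, "p_lambda =", p_lam)
```

In this instance the full step x + p_x would violate both constraints, so the update has to be shortened before it is applied.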

Because of (1) and (4), the condition

\lambda \ge 0

should be enforced at each step. This can be done by choosing an appropriate step length \alpha:

(x,\lambda) \rightarrow (x+ \alpha p_x, \lambda + \alpha p_\lambda).
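One common way to pick \alpha is a "fraction to the boundary" rule, which shortens the step so that positive quantities keep at least a fixed fraction of their current value. The sketch below applies it in a complete loop on an assumed one-dimensional problem (minimize x^2 subject to x - 1 >= 0, whose solution is x = 1 with \lambda = 2). The rule, the parameters `tau` and `sigma`, the update of \mu as a fraction of c(x)\lambda, and the function names are all assumptions made for illustration and are not prescribed by the text above.

```python
def max_step(v, p, tau=0.995):
    """Largest alpha in (0, 1] with v + alpha * p >= (1 - tau) * v, for v > 0."""
    alpha = 1.0
    for vi, pi in zip(v, p):
        if pi < 0.0:
            alpha = min(alpha, -tau * vi / pi)
    return alpha

def primal_dual_1d(x=3.0, lam=1.0, sigma=0.2, tol=1e-10, max_iter=200):
    """Toy primal-dual iteration for: minimize x^2 subject to c(x) = x - 1 >= 0."""
    for _ in range(max_iter):
        c = x - 1.0
        r_dual = 2.0 * x - lam            # residual of (5): g - A^T lambda, with A = 1
        gap = c * lam                     # complementarity product c(x) * lambda
        if abs(r_dual) < tol and gap < tol:
            break
        mu = sigma * gap                  # assumed rule: aim at a fraction of the gap
        r_comp = gap - mu                 # residual of (4)
        # 2x2 Newton system [[2, -1], [lam, c]] (p_x, p_lam) = (-r_dual, -r_comp),
        # solved in closed form by Cramer's rule.
        det = 2.0 * c + lam
        p_x = (-r_dual * c - r_comp) / det
        p_lam = (lam * r_dual - 2.0 * r_comp) / det
        # c is linear in x, so c(x + alpha * p_x) = c + alpha * p_x.
        alpha = min(max_step([lam], [p_lam]), max_step([c], [p_x]))
        x += alpha * p_x
        lam += alpha * p_lam
    return x, lam

x_opt, lam_opt = primal_dual_1d()
print("x =", x_opt, "lambda =", lam_opt)
```

Limiting the step on c(x) as well as on \lambda keeps the iterate strictly inside the feasible region, so the logarithm in (2) remains defined along the way.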

References

  1. Dantzig, George B.; Thapa, Mukund N. (2003). Linear Programming 2: Theory and Extensions. Springer-Verlag.
  2. Boyd, Stephen; Vandenberghe, Lieven (2004). Convex Optimization. Cambridge: Cambridge University Press. p. 143. ISBN 0-521-83378-7. MR 2061575.
  3. Wright, Margaret H. (2004). "The interior-point revolution in optimization: History, recent developments, and lasting consequences". Bulletin of the American Mathematical Society 42: 39. doi:10.1090/S0273-0979-04-01040-7. MR 2115066.