# Convex optimization

Convex optimization is a subfield of optimization that studies the problem of minimizing convex functions over convex sets. The convexity makes optimization easier than the general case since a local minimum must be a global minimum, and first-order conditions are sufficient conditions for optimality.[1]

Convex minimization has applications in a wide range of disciplines, such as automatic control systems, estimation and signal processing, communications and networks, electronic circuit design,[2] data analysis and modeling, finance, statistics (optimal experimental design),[3] and structural optimization.[4]

With recent advancements in computing, optimization theory, and convex analysis, convex minimization is nearly as straightforward as linear programming.

Many optimization problems can be reformulated as convex minimization problems. For example, the problem of maximizing a concave function ${\displaystyle f}$ can be re-formulated equivalently as a problem of minimizing the convex function ${\displaystyle -f}$.
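The reformulation can be illustrated with a minimal sketch. The function `concave_f`, the interval, and the ternary-search helper `minimize_convex_1d` are all hypothetical choices made for this example; any convex minimizer could stand in for the helper.

```python
def minimize_convex_1d(func, lo, hi, iterations=200):
    """Ternary search: shrink [lo, hi] around a minimizer of a convex func."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) <= func(m2):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # a minimizer lies in [m1, hi]
    return 0.5 * (lo + hi)


def concave_f(x):
    # Hypothetical concave objective with its maximum at x = 1.
    return 5.0 - (x - 1.0) ** 2


# Maximizing concave_f is the same as minimizing the convex function -concave_f.
x_max = minimize_convex_1d(lambda x: -concave_f(x), -4.0, 6.0)
print(x_max, concave_f(x_max))
```

The returned point should be close to 1, where the concave function attains its maximum value of 5.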

## Definition

An optimization problem, which aims to find a point ${\displaystyle \mathbf {x^{\ast }} }$ such that ${\textstyle f(\mathbf {x^{\ast }} )\leq f(\mathbf {x} )}$ for all ${\displaystyle \mathbf {x} }$ in some feasible set ${\displaystyle {\mathcal {X}}}$, is convex if and only if ${\displaystyle {\mathcal {X}}}$ and the objective function ${\displaystyle f}$ satisfy the following two properties:

1. ${\displaystyle {\mathcal {X}}}$ is a convex subset of a real vector space ${\displaystyle X}$ (for any two vectors/points ${\displaystyle \mathbf {x} ,\mathbf {y} \in {\mathcal {X}}}$ and ${\displaystyle t\in [0,1]}$, ${\displaystyle \mathbf {y} +t(\mathbf {x} -\mathbf {y} )=t\mathbf {x} +(1-t)\mathbf {y} }$ is also in ${\displaystyle {\mathcal {X}}}$).
2. ${\displaystyle f:{\mathcal {X}}\to \mathbb {R} }$ is a convex function from this convex subset to the real numbers; i.e., for any ${\displaystyle \mathbf {x} ,\mathbf {y} \in {\mathcal {X}}}$ and ${\displaystyle t\in [0,1]}$,
${\displaystyle f(t\mathbf {x} +(1-t)\mathbf {y} )\leq tf(\mathbf {x} )+(1-t)f(\mathbf {y} ).}$
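The inequality in property 2 can be probed numerically. The sketch below is only a randomized check for scalar functions, not a proof: it searches for a counterexample to the inequality and can only refute convexity, never establish it. The helper names, the sampling interval, and the tolerance are assumptions made for illustration.

```python
import math
import random


def find_convexity_violation(f, sample_point, trials=10_000, tol=1e-9, seed=0):
    """Return (x, y, t) violating f(tx+(1-t)y) <= t f(x) + (1-t) f(y), or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = sample_point(rng), sample_point(rng)
        t = rng.random()
        lhs = f(t * x + (1 - t) * y)
        rhs = t * f(x) + (1 - t) * f(y)
        if lhs > rhs + tol:
            return x, y, t
    return None


def sample_interval(rng):
    # The interval [-3, 3] is a convex subset of the real line.
    return rng.uniform(-3.0, 3.0)


square_violation = find_convexity_violation(lambda x: x * x, sample_interval)
sine_violation = find_convexity_violation(math.sin, sample_interval)
print(square_violation)            # no violation found for the convex x^2
print(sine_violation is not None)  # sin is not convex on [-3, 3]
```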

### In terms of the general definition of optimization

The convex problem can also be stated compactly using the general definition of an optimization problem: find some ${\displaystyle \mathbf {x^{\ast }} \in {\mathcal {X}}}$ such that

${\displaystyle f(\mathbf {x^{\ast }} )=\min\{f(\mathbf {x} ):\mathbf {x} \in {\mathcal {X}}\},}$

for some feasible set ${\displaystyle {\mathcal {X}}}$ and objective function ${\displaystyle f:{\mathcal {X}}\to \mathbb {R} }$. This is the general definition of an optimization problem; by itself it does not guarantee a convex optimization problem. For the problem to be convex, ${\displaystyle {\mathcal {X}}}$ must be a convex set and ${\displaystyle f}$ must be a convex function. [5] [6]

### Inequality-constrained optimization problems

An optimization problem of the form

${\displaystyle {\begin{aligned}&{\underset {\mathbf {x} }{\operatorname {minimize} }}&&f(\mathbf {x} )\\&\operatorname {subject\ to} &&g_{i}(\mathbf {x} )\leq 0,\quad i=1,\dots ,m\end{aligned}}}$

is a convex optimization problem if the functions ${\displaystyle f,g_{1},\ldots ,g_{m}:X\rightarrow \mathbb {R} }$ are all convex over their domains.[7]

#### Proof

By the assumption that ${\displaystyle f}$ is convex, property 2 is satisfied.

For property 1, first observe that each constraint ${\displaystyle g_{i}(\mathbf {x} )\leq 0}$ by itself defines a convex set. (This set is called a sublevel set of ${\displaystyle g_{i}}$.) By definition, if ${\displaystyle \mathbf {x} }$ and ${\displaystyle \mathbf {y} }$ are in the set, then ${\displaystyle g_{i}(\mathbf {x} )\leq 0}$ and ${\displaystyle g_{i}(\mathbf {y} )\leq 0}$. Since ${\displaystyle g_{i}}$ is convex, for any ${\displaystyle t\in [0,1]}$ we have ${\displaystyle g_{i}(t\mathbf {x} +(1-t)\mathbf {y} )\leq tg_{i}(\mathbf {x} )+(1-t)g_{i}(\mathbf {y} )}$. The right-hand side is at most 0 since ${\displaystyle t\geq 0}$ and ${\displaystyle 1-t\geq 0}$, so ${\displaystyle t\mathbf {x} +(1-t)\mathbf {y} }$ is also in the set.

Next, observe that the intersection of convex sets ${\displaystyle {\mathcal {X_{1}}}\cap \cdots \cap {\mathcal {X_{m}}}}$ is itself convex: By definition ${\displaystyle \mathbf {x} \in {\mathcal {X_{1}}}\cap \cdots \cap {\mathcal {X_{m}}}}$ if and only if ${\displaystyle \mathbf {x} \in {\mathcal {X_{1}}},\mathbf {x} \in {\mathcal {X_{2}}},\ldots ,\mathbf {x} \in {\mathcal {X_{m}}}}$. So for ${\displaystyle \mathbf {x} ,\mathbf {y} \in {\mathcal {X_{1}}}\cap \cdots \cap {\mathcal {X_{m}}}}$, ${\displaystyle \mathbf {x} \in {\mathcal {X_{1}}},\ldots ,\mathbf {x} \in {\mathcal {X_{m}}}}$ and ${\displaystyle \mathbf {y} \in {\mathcal {X_{1}}},\ldots ,\mathbf {y} \in {\mathcal {X_{m}}}}$. By the convexity of the sets ${\displaystyle {\mathcal {X_{i}}}}$, for ${\displaystyle t\in [0,1]}$, ${\displaystyle t\mathbf {x} +(1-t)\mathbf {y} \in {\mathcal {X_{i}}}}$. Since this is true for all i, ${\displaystyle t\mathbf {x} +(1-t)\mathbf {y} \in {\mathcal {X_{1}}}\cap \cdots \cap {\mathcal {X_{m}}}}$.

Since we found that each constraint alone imposes a convex feasible set, and that the intersection of convex sets is convex, the above form of optimization problem is convex.

### Standard form (inequality- and equality-constrained)

Standard form is the usual and perhaps most intuitive form of describing a convex minimization problem. It consists of the following three parts:

• A convex objective function ${\displaystyle f:{\mathcal {X}}\to \mathbb {R} }$ to be minimized over the input ${\displaystyle \mathbf {x} }$
• A series of inequality constraints of the form ${\displaystyle g_{i}(\mathbf {x} )\leq 0}$, with each function ${\displaystyle g_{i}}$ convex
• A series of equality constraints of the form ${\displaystyle h_{i}(\mathbf {x} )=0}$, with each function ${\displaystyle h_{i}}$ both convex and concave, or in geometric terms, affine
• In practice, "affine" is often used interchangeably with "linear," but "affine" is more general and thus more powerful.
• If ${\displaystyle {\mathcal {X}}\subset \mathbb {R} ^{n}}$, the equality constraints can each be expressed as ${\displaystyle h_{i}(\mathbf {x} )=\mathbf {a_{i}} ^{T}\mathbf {x} +b_{i}}$, with ${\displaystyle \mathbf {a_{i}} }$ a column vector, the superscript ${\displaystyle T}$ indicating the transpose operation for vectors in ${\displaystyle \mathbb {R} ^{n}}$, and ${\displaystyle b_{i}}$ a real number. You can think of ${\displaystyle \mathbf {a_{i}} ^{T}\mathbf {x} }$ as shorthand for ${\displaystyle a_{i,1}x_{1}+\cdots +a_{i,n}x_{n}}$.

A convex minimization problem in standard form is thus written as

${\displaystyle {\begin{aligned}&{\underset {\mathbf {x} }{\operatorname {minimize} }}&&f(\mathbf {x} )\\&\operatorname {subject\ to} &&g_{i}(\mathbf {x} )\leq 0,\quad i=1,\dots ,m\\&&&h_{i}(\mathbf {x} )=0,\quad i=1,\dots ,p.\end{aligned}}}$

This form of problem is exactly equivalent to a problem constrained only by inequalities. You can rephrase the problem without equality constraints by replacing each equality constraint ${\displaystyle h_{i}(\mathbf {x} )=0}$ with a pair of inequality constraints ${\displaystyle h_{i}(\mathbf {x} )\leq 0}$ and ${\displaystyle -h_{i}(\mathbf {x} )\leq 0}$, and the previous analysis holds. This is why ${\displaystyle h_{i}}$ being both convex and concave, i.e. both ${\displaystyle h_{i}}$ and ${\displaystyle -h_{i}}$ being convex, implies that the above form is a convex optimization problem.

Writing equality constraints instead of twice as many inequality constraints is useful as a shorthand.
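The replacement of equalities by pairs of inequalities can be sketched in a few lines. The constraint functions `g1` and `h1` below are hypothetical problem data in the plane, and the helper names and the feasibility tolerance are assumptions for illustration only.

```python
def equalities_to_inequalities(equalities):
    """Replace each h(x) = 0 by the pair h(x) <= 0 and -h(x) <= 0."""
    inequalities = []
    for h in equalities:
        inequalities.append(h)
        inequalities.append(lambda x, h=h: -h(x))
    return inequalities


def is_feasible(x, inequalities, tol=1e-12):
    """A point is feasible if every inequality g(x) <= 0 holds (up to tol)."""
    return all(g(x) <= tol for g in inequalities)


def g1(x):
    # Convex inequality constraint: the disc of radius 2 around the origin.
    return x[0] ** 2 + x[1] ** 2 - 4.0


def h1(x):
    # Affine equality constraint: the line x0 + x1 = 1.
    return x[0] + x[1] - 1.0


constraints = [g1] + equalities_to_inequalities([h1])
print(len(constraints))                       # one inequality plus one pair
print(is_feasible((0.5, 0.5), constraints))   # on the line, inside the disc
print(is_feasible((0.5, 0.6), constraints))   # off the line
print(is_feasible((3.0, -2.0), constraints))  # on the line, outside the disc
```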

## Theory

The following statements are true about the convex minimization problem:

• if a local minimum exists, then it is a global minimum.
• the set of all (global) minima is convex.
• for each strictly convex function, if the function has a minimum, then the minimum is unique.

These results are used by the theory of convex minimization along with geometric notions from functional analysis (in Hilbert spaces) such as the Hilbert projection theorem, the separating hyperplane theorem, and Farkas' lemma.
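As a small numerical illustration of the uniqueness statement, the sketch below runs plain gradient descent on a hypothetical strictly convex function, f(x) = exp(x) - 2x, from several starting points. The step size and iteration count are assumptions chosen for this particular function, not general recommendations.

```python
import math


def gradient_descent(grad, x0, step=0.1, iterations=500):
    """Fixed-step gradient descent for a differentiable scalar function."""
    x = x0
    for _ in range(iterations):
        x -= step * grad(x)
    return x


def grad_f(x):
    # Derivative of the strictly convex function f(x) = exp(x) - 2x.
    return math.exp(x) - 2.0


starts = [-3.0, -1.0, 0.0, 2.0]
limits = [gradient_descent(grad_f, x0) for x0 in starts]
print(limits)         # every run ends near the same point
print(math.log(2.0))  # the unique minimizer, where exp(x) = 2
```

Every starting point leads to the same stationary point, which for a convex function is a global minimum.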

## Examples

The following problems are all convex minimization problems, or can be transformed into convex minimization problems via a change of variables:

## Lagrange multipliers

Consider a convex minimization problem given in standard form by a cost function ${\displaystyle f(x)}$ and inequality constraints ${\displaystyle g_{i}(x)\leq 0}$ for ${\displaystyle 1\leq i\leq m}$. Then the domain ${\displaystyle {\mathcal {X}}}$ is:

${\displaystyle {\mathcal {X}}=\left\{x\in X\vert g_{1}(x),\ldots ,g_{m}(x)\leq 0\right\}.}$

The Lagrangian function for the problem is

${\displaystyle L(x,\lambda _{0},\lambda _{1},\ldots ,\lambda _{m})=\lambda _{0}f(x)+\lambda _{1}g_{1}(x)+\cdots +\lambda _{m}g_{m}(x).}$

For each point ${\displaystyle x}$ in ${\displaystyle {\mathcal {X}}}$ that minimizes ${\displaystyle f}$ over ${\displaystyle {\mathcal {X}}}$, there exist real numbers ${\displaystyle \lambda _{0},\lambda _{1},\ldots ,\lambda _{m},}$ called Lagrange multipliers, that satisfy these conditions simultaneously:

1. ${\displaystyle x}$ minimizes ${\displaystyle L(y,\lambda _{0},\lambda _{1},\ldots ,\lambda _{m})}$ over all ${\displaystyle y\in X,}$
2. ${\displaystyle \lambda _{0},\lambda _{1},\ldots ,\lambda _{m}\geq 0,}$ with at least one ${\displaystyle \lambda _{k}>0,}$
3. ${\displaystyle \lambda _{1}g_{1}(x)=\cdots =\lambda _{m}g_{m}(x)=0}$ (complementary slackness).

If there exists a "strictly feasible point", that is, a point ${\displaystyle z}$ satisfying

${\displaystyle g_{1}(z),\ldots ,g_{m}(z)<0,}$

then the statement above can be strengthened to require that ${\displaystyle \lambda _{0}=1}$.

Conversely, if some ${\displaystyle x}$ in ${\displaystyle {\mathcal {X}}}$ satisfies (1)–(3) for scalars ${\displaystyle \lambda _{0},\ldots ,\lambda _{m}}$ with ${\displaystyle \lambda _{0}=1}$, then ${\displaystyle x}$ is certain to minimize ${\displaystyle f}$ over ${\displaystyle {\mathcal {X}}}$.
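Conditions (1)–(3) can be checked by hand on a toy problem. The sketch below assumes a hypothetical one-dimensional instance, minimizing f(x) = x² subject to g(x) = 1 - x ≤ 0, with candidate point x = 1 and multipliers λ₀ = 1, λ₁ = 2. The point z = 2 is strictly feasible, so taking λ₀ = 1 is consistent with the statement above. The grid only approximates "all y", so this is a sanity check rather than a proof.

```python
def f(x):
    return x * x


def g(x):
    return 1.0 - x


def lagrangian(y, lam0, lam1):
    return lam0 * f(y) + lam1 * g(y)


# Hypothetical candidate point and multipliers.
x_star, lam0, lam1 = 1.0, 1.0, 2.0
grid = [k / 100.0 for k in range(-500, 501)]

# (1) x_star minimizes the Lagrangian over the sampled points of the whole space.
cond1 = all(
    lagrangian(x_star, lam0, lam1) <= lagrangian(y, lam0, lam1) + 1e-12
    for y in grid
)
# (2) multipliers are nonnegative and not all zero.
cond2 = lam0 >= 0 and lam1 >= 0 and (lam0 > 0 or lam1 > 0)
# (3) complementary slackness.
cond3 = lam1 * g(x_star) == 0.0

# The converse: x_star should minimize f over the sampled feasible points.
is_minimizer = all(f(x_star) <= f(y) for y in grid if g(y) <= 0)
print(cond1, cond2, cond3, is_minimizer)
```

Here the Lagrangian is y² + 2(1 - y) = (y - 1)² + 1, which makes condition (1) easy to confirm by inspection as well.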

## Methods of solving

Convex minimization problems can be solved by the following contemporary methods:[8]

Other methods of interest:

Subgradient methods can be implemented simply and so are widely used.[9] Dual subgradient methods are subgradient methods applied to a dual problem. The drift-plus-penalty method is similar to the dual subgradient method, but takes a time average of the primal variables.
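A minimal sketch of a subgradient method shows how little code is needed. The objective f(x) = |x - 3| + 0.5|x| is a hypothetical nonsmooth convex function with minimizer x = 3, and the step sizes 1/k are one common diminishing rule chosen here for illustration. Because the iterates need not decrease the objective monotonically, the best point seen so far is tracked.

```python
def sign(v):
    return (v > 0) - (v < 0)


def f(x):
    # Hypothetical nonsmooth convex objective, minimized at x = 3.
    return abs(x - 3.0) + 0.5 * abs(x)


def subgradient(x):
    # One valid subgradient of f at x (sign(0) = 0 picks an element at kinks).
    return sign(x - 3.0) + 0.5 * sign(x)


def subgradient_method(x0, iterations=5000):
    x = x0
    best_x, best_f = x, f(x)
    for k in range(1, iterations + 1):
        x -= subgradient(x) / k  # diminishing step size 1/k
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f


best_x, best_f = subgradient_method(0.0)
print(best_x, best_f)
```

The best iterate approaches x = 3 with objective value near 1.5, but only slowly, which is typical of diminishing step-size rules.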

## Convex minimization with good complexity: Self-concordant barriers

The efficiency of iterative methods is poor for the class of convex problems as a whole, because this class includes "bad guys" whose minimum cannot be approximated without a large number of function and subgradient evaluations;[10] thus, to obtain practically appealing efficiency results, it is necessary to restrict the class of problems further. Two such classes are problems with special barrier functions: first, self-concordant barrier functions, according to the theory of Nesterov and Nemirovskii, and second, self-regular barrier functions, according to the theory of Terlaky and coauthors.
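To give a feel for how a barrier function is used, the sketch below applies a logarithmic barrier to a hypothetical toy problem: minimize x subject to x ≥ 1. For each barrier weight t it minimizes t·x - log(x - 1) with damped Newton steps of length 1/(1 + λ), where λ is the Newton decrement. The problem, the schedule for t, and the tolerances are assumptions for illustration. This is not a full interior-point implementation.

```python
import math


def newton_decrement_and_step(x, t):
    """Newton data for phi(x) = t*x - log(x - 1) on the domain x > 1."""
    u = x - 1.0                # slack of the constraint x >= 1
    grad = t - 1.0 / u         # phi'(x)
    hess = 1.0 / (u * u)       # phi''(x) > 0
    step = -grad / hess
    decrement = abs(grad) / math.sqrt(hess)
    return decrement, step


def barrier_method(x0, t_values, tol=1e-8, max_inner=50):
    x = x0
    path = []
    for t in t_values:
        for _ in range(max_inner):
            decrement, step = newton_decrement_and_step(x, t)
            if decrement < tol:
                break
            x += step / (1.0 + decrement)  # damped step keeps x > 1 here
        path.append((t, x))
    return x, path


t_values = [10.0 ** k for k in range(7)]  # t = 1, 10, ..., 1e6
x_final, path = barrier_method(3.0, t_values)
print(x_final)
```

For this toy problem the barrier minimizer for weight t is x = 1 + 1/t, so the iterates approach the constrained minimizer x = 1 from the interior as t grows.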

## Quasiconvex minimization

Problems with convex level sets can be efficiently minimized, in theory. Yurii Nesterov proved that quasi-convex minimization problems could be solved efficiently, and his results were extended by Kiwiel.[11] However, such theoretically "efficient" methods use "divergent-series" stepsize rules, which were first developed for classical subgradient methods. Classical subgradient methods using divergent-series rules are much slower than modern methods of convex minimization, such as subgradient projection methods, bundle methods of descent, and nonsmooth filter methods.

Solving even close-to-convex but non-convex problems can be computationally intractable. Minimizing a unimodal function is intractable, regardless of the smoothness of the function, according to results of Ivanov.[12]

## Convex maximization

Conventionally, the definition of a convex optimization problem (we recall) requires that the objective function f be convex and that it be minimized over a convex feasible set. In the special case of linear programming (LP), the objective function is both concave and convex, and so LP can also consider the problem of maximizing an objective function without confusion. However, for most convex minimization problems, the objective function is not concave, and therefore such problems are formulated in the standard form of convex optimization problems, that is, minimizing the convex objective function.

For nonlinear convex minimization, the associated maximization problem obtained by substituting the supremum operator for the infimum operator is not a problem of convex optimization, as conventionally defined. However, it is studied in the larger field of convex optimization as a problem of convex maximization.[13]

The convex maximization problem is especially important for studying the existence of maxima. Consider the restriction of a convex function to a compact convex set: Then, on that set, the function attains its constrained maximum only on the boundary.[14] Such results, called "maximum principles", are useful in the theory of harmonic functions, potential theory, and partial differential equations.
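The boundary behaviour can be observed on a small example. The sketch below evaluates a hypothetical convex function on a grid over the square [-1, 1]² and locates the largest value. The function and grid resolution are assumptions, and a grid search only illustrates the statement rather than proving it.

```python
def f(x, y):
    # Hypothetical convex function whose minimizer (0.3, -0.2) is interior.
    return (x - 0.3) ** 2 + (y + 0.2) ** 2


# Grid over the compact convex set [-1, 1] x [-1, 1].
grid = [(i - 10) / 10.0 for i in range(21)]
best_point = max(((x, y) for x in grid for y in grid), key=lambda p: f(*p))
on_boundary = max(abs(best_point[0]), abs(best_point[1])) == 1.0
print(best_point, f(*best_point), on_boundary)
```

The maximum over the grid is found at a corner of the square, the vertex farthest from the interior minimizer.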

The problem of minimizing a quadratic multivariate polynomial on a cube is NP-hard.[15] In fact, the quadratic minimization problem is NP-hard even if the matrix has only one negative eigenvalue.[16]

## Extensions

Advanced treatments also consider convex functions that can attain positive infinity; for example, the indicator function of convex analysis is zero for every ${\displaystyle x\in {\mathcal {X}}}$ and positive infinity otherwise.
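One way to see the use of extended-valued functions is that adding the indicator function to an objective folds the constraint into the objective itself. The sketch below assumes a hypothetical feasible set, the interval [1, 3], and uses `math.inf` to represent positive infinity; the helper names are illustrative.

```python
import math


def indicator(in_set):
    """Indicator function of convex analysis: 0 on the set, +infinity elsewhere."""
    return lambda x: 0.0 if in_set(x) else math.inf


def with_constraint(f, in_set):
    """Extended-valued objective f + indicator, defined on the whole space."""
    ind = indicator(in_set)
    return lambda x: f(x) + ind(x)


# Hypothetical feasible set [1, 3] and objective x^2.
objective = with_constraint(lambda x: x * x, lambda x: 1.0 <= x <= 3.0)

grid = [k / 10.0 for k in range(-50, 51)]
x_best = min(grid, key=objective)
print(objective(2.0), objective(0.0), x_best)
```

Minimizing the extended-valued objective over the whole line gives the same answer as minimizing x² over [1, 3], because infeasible points carry infinite cost.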

Extensions of convex functions include biconvex, pseudo-convex, and quasi-convex functions. Partial extensions of the theory of convex analysis and iterative methods for approximately solving non-convex minimization problems occur in the field of generalized convexity ("abstract convex analysis").

## Notes

1. ^ Rockafellar, R. Tyrrell (1993). "Lagrange multipliers and optimality" (PDF). SIAM Review. 35 (2): 183–238. doi:10.1137/1035044.
2. ^ Boyd/Vandenberghe, p. 17.
3. ^ Christensen/Klarbring, chapter 4.
4. ^ Boyd/Vandenberghe, chapter 7.
5. ^ Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (1996). Convex analysis and minimization algorithms: Fundamentals. p. 291.
6. ^ Ben-Tal, Aharon; Nemirovskiĭ, Arkadiĭ Semenovich (2001). Lectures on modern convex optimization: analysis, algorithms, and engineering applications. pp. 335–336.
7. ^ Boyd/Vandenberghe, p. 7
8. ^ For methods for convex minimization, see the volumes by Hiriart-Urruty and Lemaréchal (bundle) and the textbooks by Ruszczyński, Bertsekas, and Boyd and Vandenberghe (interior point).
9. ^ Bertsekas
10. ^ Hiriart-Urruty & Lemaréchal (1993, Example XV.1.1.2, p. 277) discuss a "bad guy" constructed by Arkadi Nemirovskii.
11. ^ In theory, quasiconvex programming and convex programming problems can be solved in reasonable amount of time, where the number of iterations grows like a polynomial in the dimension of the problem (and in the reciprocal of the approximation error tolerated):

Kiwiel, Krzysztof C. (2001). "Convergence and efficiency of subgradient methods for quasiconvex minimization". Mathematical Programming (Series A). 90 (1). Berlin, Heidelberg: Springer. pp. 1–25. doi:10.1007/PL00011414. ISSN 0025-5610. MR 1819784. Kiwiel acknowledges that Yurii Nesterov first established that quasiconvex minimization problems can be solved efficiently.

12. ^ Nemirovskii and Judin
13. ^ Convex maximization is mentioned in the subsection on convex optimization in this textbook: Ulrich Faigle, Walter Kern, and George Still. Algorithmic principles of mathematical programming. Springer-Verlag. Texts in Mathematics. Chapter 10.2, Subsection "Convex optimization", pages 205-206.
14. ^ Theorem 32.1 in Rockafellar's Convex Analysis states this maximum principle for extended real-valued functions.
15. ^ Sahni, S. "Computationally related problems," in SIAM Journal on Computing, 3, 262–279, 1974.
16. ^ Quadratic programming with one negative eigenvalue is NP-hard, Panos M. Pardalos and Stephen A. Vavasis in Journal of Global Optimization, Volume 1, Number 1, 1991, pg.15-22.

## References

• Borwein, Jonathan, and Lewis, Adrian. (2000). Convex Analysis and Nonlinear Optimization. Springer.
• Hiriart-Urruty, Jean-Baptiste, and Lemaréchal, Claude. (2004). Fundamentals of Convex analysis. Berlin: Springer.
• Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (1993). Convex analysis and minimization algorithms, Volume I: Fundamentals. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. 305. Berlin: Springer-Verlag. pp. xviii+417. ISBN 3-540-56850-6. MR 1261420.
• Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (1993). Convex analysis and minimization algorithms, Volume II: Advanced theory and bundle methods. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. 306. Berlin: Springer-Verlag. pp. xviii+346. ISBN 3-540-56852-2. MR 1295240.
• Kiwiel, Krzysztof C. (1985). Methods of Descent for Nondifferentiable Optimization. Lecture Notes in Mathematics. New York: Springer-Verlag. ISBN 978-3-540-15642-0.
• Lemaréchal, Claude (2001). "Lagrangian relaxation". In Michael Jünger and Denis Naddef. Computational combinatorial optimization: Papers from the Spring School held in Schloß Dagstuhl, May 15–19, 2000. Lecture Notes in Computer Science. 2241. Berlin: Springer-Verlag. pp. 112–156. doi:10.1007/3-540-45586-8_4. ISBN 3-540-42877-1. MR 1900016.
• Nesterov, Y. and Nemirovsky, A. (1994). Interior Point Polynomial Methods in Convex Programming. SIAM.
• Nesterov, Yurii. (2004). Introductory Lectures on Convex Optimization, Kluwer Academic Publishers
• Rockafellar, R. T. (1970). Convex analysis. Princeton: Princeton University Press.