Carathéodory's theorem (convex hull)

From Wikipedia, the free encyclopedia
Jump to: navigation, search
An illustration of Carathéodory's theorem for a square in R2
See also Carathéodory's theorem (disambiguation) for other meanings

In convex geometry Carathéodory's theorem states that if a point x of Rd lies in the convex hull of a set P, there is a subset P′ of P consisting of d + 1 or fewer points such that x lies in the convex hull of P′. Equivalently, x lies in an r-simplex with vertices in P, where r \leq  d. The result is named for Constantin Carathéodory, who proved the theorem in 1911 for the case when P is compact. In 1914 Ernst Steinitz expanded Carathéodory's theorem for any sets P in Rd.

For example, consider a set P = {(0,0), (0,1), (1,0), (1,1)} which is a subset of R2. The convex hull of this set is a square. Consider now a point x = (1/4, 1/4), which is in the convex hull of P. We can then construct a set {(0,0),(0,1),(1,0)} = P′, the convex hull of which is a triangle and encloses x, and thus the theorem works for this instance, since |P′| = 3. It may help to visualise Carathéodory's theorem in 2 dimensions, as saying that we can construct a triangle consisting of points from P that encloses any point in P.


Let x be a point in the convex hull of P. Then, x is a convex combination of a finite number of points in P :

\mathbf{x}=\sum_{j=1}^k \lambda_j \mathbf{x}_j

where every xj is in P, every λj is non-negative, and \sum_{j=1}^k\lambda_j=1.

Suppose k > d + 1 (otherwise, there is nothing to prove). Then, the points x2 − x1, ..., xk − x1 are linearly dependent,

so there are real scalars μ2, ..., μk, not all zero, such that

\sum_{j=2}^k \mu_j (\mathbf{x}_j-\mathbf{x}_1)=\mathbf{0}.

If μ1 is defined as

\mu_1:=-\sum_{j=2}^k \mu_j


\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}
\sum_{j=1}^k \mu_j=0

and not all of the μj are equal to zero. Therefore, at least one μj > 0. Then,

\mathbf{x} = \sum_{j=1}^k \lambda_j \mathbf{x}_j-\alpha\sum_{j=1}^k \mu_j \mathbf{x}_j = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j

for any real α. In particular, the equality will hold if α is defined as

 \alpha:=\min_{1\leq j \leq k} \left\{ \tfrac{\lambda_j}{\mu_j}:\mu_j>0\right\}=\tfrac{\lambda_i}{\mu_i}.

Note that α>0, and for every j between 1 and k,

\lambda_j-\alpha\mu_j \geq 0.

In particular, λi − αμi = 0 by definition of α. Therefore,

\mathbf{x} = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j

where every \lambda_j - \alpha \mu_j is nonnegative, their sum is one , and furthermore, \lambda_i-\alpha\mu_i=0. In other words, x is represented as a convex combination of at most k-1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.

An alternative proof uses Helly's theorem.

See also[edit]


External links[edit]