Carathéodory's theorem (convex hull)

An illustration of Carathéodory's theorem for a square in $\mathbb{R}^2$

In convex geometry, Carathéodory's theorem states that if a point x of $\mathbb{R}^d$ lies in the convex hull of a set P, then there is a subset P′ of P consisting of d + 1 or fewer points such that x lies in the convex hull of P′. Equivalently, x lies in an r-simplex with vertices in P, where $r \leq d$. The result is named for Constantin Carathéodory, who proved the theorem in 1911 for the case when P is compact. In 1914, Ernst Steinitz extended Carathéodory's theorem to arbitrary sets P in $\mathbb{R}^d$.

For example, consider the set P = {(0,0), (0,1), (1,0), (1,1)}, a subset of $\mathbb{R}^2$ whose convex hull is a square. Now consider the point x = (1/4, 1/4), which lies in the convex hull of P. We can construct the set P′ = {(0,0), (0,1), (1,0)}, whose convex hull is a triangle that encloses x, so the theorem holds for this instance, since |P′| = 3. It may help to visualise Carathéodory's theorem in two dimensions as saying that for any point in the convex hull of P, we can construct a triangle with vertices in P that encloses it.
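This example can be checked numerically. The following Python sketch (the array names are ours, chosen for illustration) solves the small linear system for the coefficients of x with respect to P′ and confirms that they form a convex combination:

```python
import numpy as np

# Vertices of P' and the target point x from the example above.
P_prime = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
x = np.array([0.25, 0.25])

# Solve for lambda with lambda @ P_prime == x and sum(lambda) == 1:
# stack the affine constraint onto the two coordinate equations.
A = np.vstack([P_prime.T, np.ones(3)])   # shape (3, 3)
b = np.append(x, 1.0)                    # shape (3,)
lam = np.linalg.solve(A, b)

print(lam)                               # [0.5  0.25 0.25]
assert np.all(lam >= -1e-12) and np.isclose(lam.sum(), 1.0)
assert np.allclose(lam @ P_prime, x)
```

All three coefficients are nonnegative and sum to one, so x is indeed a convex combination of the three points of P′.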

Proof

Let x be a point in the convex hull of P. Then x is a convex combination of a finite number of points in P:

$\mathbf{x}=\sum_{j=1}^k \lambda_j \mathbf{x}_j$

where every $\mathbf{x}_j$ is in P, every $\lambda_j$ is positive, and $\sum_{j=1}^k\lambda_j=1$.

Suppose k > d + 1 (otherwise, there is nothing to prove). Then the points $\mathbf{x}_2-\mathbf{x}_1, \ldots, \mathbf{x}_k-\mathbf{x}_1$ are linearly dependent, since these are k − 1 > d vectors in $\mathbb{R}^d$,

so there are real scalars $\mu_2, \ldots, \mu_k$, not all zero, such that

$\sum_{j=2}^k \mu_j (\mathbf{x}_j-\mathbf{x}_1)=\mathbf{0}.$

If $\mu_1$ is defined as

$\mu_1:=-\sum_{j=2}^k \mu_j$

then

$\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}$
$\sum_{j=1}^k \mu_j=0$

and not all of the $\mu_j$ are equal to zero. Since the $\mu_j$ sum to zero and are not all zero, at least one $\mu_j > 0$. Then,

$\mathbf{x} = \sum_{j=1}^k \lambda_j \mathbf{x}_j-\alpha\sum_{j=1}^k \mu_j \mathbf{x}_j = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j$

for any real $\alpha$, since the subtracted sum is zero. In particular, choose

$\alpha:=\min_{1\leq j \leq k} \left\{ \tfrac{\lambda_j}{\mu_j}:\mu_j>0\right\}=\tfrac{\lambda_i}{\mu_i}.$

where $i$ is an index attaining the minimum. Note that $\alpha>0$, and for every $j$ between $1$ and $k$,

$\lambda_j-\alpha\mu_j \geq 0.$

In particular, $\lambda_i-\alpha\mu_i=0$ by definition of $\alpha$. Therefore,

$\mathbf{x} = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j$

where every $\lambda_j - \alpha \mu_j$ is nonnegative, their sum is one (since $\sum_{j=1}^k \mu_j = 0$), and furthermore, $\lambda_i-\alpha\mu_i=0$. In other words, x is represented as a convex combination of at most k − 1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.
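The proof is constructive, so it yields an algorithm: repeatedly find an affine dependence among the points and use it to drive one coefficient to zero. A minimal Python sketch along these lines (the function name and the choice of computing the dependence via an SVD null space are ours, not from the text) is:

```python
import numpy as np

def caratheodory_reduce(pts, lam, tol=1e-12):
    """Reduce a convex combination lam @ pts of points in R^d
    to one using at most d + 1 points (a sketch of the proof's step)."""
    pts, lam = np.asarray(pts, float), np.asarray(lam, float)
    d = pts.shape[1]
    while len(pts) > d + 1:
        # Find mu with sum(mu) = 0 and mu @ pts = 0, mu not all zero:
        # any null-space vector of the (d+1) x k system below works,
        # and one exists because k > d + 1.
        A = np.vstack([pts.T, np.ones(len(pts))])
        mu = np.linalg.svd(A)[2][-1]
        if not np.any(mu > tol):         # ensure some positive entry
            mu = -mu
        # alpha = min over mu_j > 0 of lam_j / mu_j keeps every new
        # weight nonnegative and zeroes out at least one of them.
        pos = mu > tol
        alpha = np.min(lam[pos] / mu[pos])
        lam = lam - alpha * mu
        keep = lam > tol                 # drop the zeroed-out point(s)
        pts, lam = pts[keep], lam[keep] / lam[keep].sum()
    return pts, lam

# Usage: reduce a combination of the four square vertices giving (1/4, 1/4).
pts = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
lam = np.array([0.55, 0.2, 0.2, 0.05])
x = lam @ pts                            # equals (0.25, 0.25)
pts2, lam2 = caratheodory_reduce(pts, lam)
assert len(pts2) <= 3 and np.allclose(lam2 @ pts2, x)
```

Each pass through the loop removes at least one point, so the loop terminates after at most k − (d + 1) iterations, mirroring the repetition argument in the proof.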

An alternative proof uses Helly's theorem.