# Carathéodory's theorem (convex hull)

An illustration of Carathéodory's theorem for a square in R2

In convex geometry, Carathéodory's theorem states that if a point x of Rd lies in the convex hull of a set P, then x can be written as the convex combination of at most d + 1 points in P. Namely, there is a subset P′ of P consisting of d + 1 or fewer points such that x lies in the convex hull of P′. Equivalently, x lies in an r-simplex with vertices in P, where ${\displaystyle r\leq d}$. The smallest r that makes the last statement valid for each x in the convex hull of P is defined as the Carathéodory's number of P. Depending on the properties of P, upper bounds lower than the one provided by Carathéodory's theorem can be obtained.[1] Note that P need not be itself convex. A consequence of this is that P′ can always be extremal in P, as non-extremal points can be removed from P without changing the membership of x in the convex hull.

The similar theorems of Helly and Radon are closely related to Carathéodory's theorem: the latter theorem can be used to prove the former theorems and vice versa.[2]

The result is named for Constantin Carathéodory, who proved the theorem in 1907 for the case when P is compact.[3] In 1914 Ernst Steinitz expanded Carathéodory's theorem for any sets P in Rd.[4]

## Example

Consider a set P = {(0,0), (0,1), (1,0), (1,1)} which is a subset of R2. The convex hull of this set is a square. Consider now a point x = (1/4, 1/4), which is in the convex hull of P. We can then construct a set {(0,0),(0,1),(1,0)} = P′, the convex hull of which is a triangle and encloses x, and thus the theorem works for this instance, since |P′| = 3. It may help to visualise Carathéodory's theorem in 2 dimensions, as saying that we can construct a triangle consisting of points from P that encloses any point in P.

## Proof of Carathéodory's theorem

Let x be a point in the convex hull of P. Then, x is a convex combination of a finite number of points in P :

${\displaystyle \mathbf {x} =\sum _{j=1}^{k}\lambda _{j}\mathbf {x} _{j}}$

where every xj is in P, every λj is non-negative, and ${\displaystyle \sum _{j=1}^{k}\lambda _{j}=1}$.

Suppose k > d + 1 (otherwise, there is nothing to prove). Then, the vectors x2 − x1, ..., xk − x1 are linearly dependent,

so there are real scalars μ2, ..., μk, not all zero, such that

${\displaystyle \sum _{j=2}^{k}\mu _{j}(\mathbf {x} _{j}-\mathbf {x} _{1})=\mathbf {0} .}$

If μ1 is defined as

${\displaystyle \mu _{1}:=-\sum _{j=2}^{k}\mu _{j}}$

then

${\displaystyle \sum _{j=1}^{k}\mu _{j}\mathbf {x} _{j}=\mathbf {0} }$
${\displaystyle \sum _{j=1}^{k}\mu _{j}=0}$

and not all of the μj are equal to zero. Therefore, at least one μj > 0. Then,

${\displaystyle \mathbf {x} =\sum _{j=1}^{k}\lambda _{j}\mathbf {x} _{j}-\alpha \sum _{j=1}^{k}\mu _{j}\mathbf {x} _{j}=\sum _{j=1}^{k}(\lambda _{j}-\alpha \mu _{j})\mathbf {x} _{j}}$

for any real α. In particular, the equality will hold if α is defined as

${\displaystyle \alpha :=\min _{1\leq j\leq k}\left\{{\tfrac {\lambda _{j}}{\mu _{j}}}:\mu _{j}>0\right\}={\tfrac {\lambda _{i}}{\mu _{i}}}.}$

Note that α > 0, and for every j between 1 and k,

${\displaystyle \lambda _{j}-\alpha \mu _{j}\geq 0.}$

In particular, λi − αμi = 0 by definition of α. Therefore,

${\displaystyle \mathbf {x} =\sum _{j=1}^{k}(\lambda _{j}-\alpha \mu _{j})\mathbf {x} _{j}}$

where every ${\displaystyle \lambda _{j}-\alpha \mu _{j}}$ is nonnegative, their sum is one , and furthermore, ${\displaystyle \lambda _{i}-\alpha \mu _{i}=0}$. In other words, x is represented as a convex combination of at most k-1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.

An alternative proof uses Helly's theorem.[5]