# Covering number

In mathematics, a covering number is the number of spherical balls of a given size needed to completely cover a given space, with possible overlaps. Two related concepts are the packing number, the number of disjoint balls that fit in a space, and the metric entropy, the number of points that fit in a space when constrained to lie at some fixed minimum distance apart.

## Definition

Let (M, d) be a metric space, let K be a subset of M, and let r be a positive real number. Let Br(x) denote the ball of radius r centered at x. A subset C of M is an r-external covering of K if:

${\displaystyle K\subseteq \cup _{x\in C}B_{r}(x)}$.

In other words, for every ${\displaystyle y\in K}$ there exists ${\displaystyle x\in C}$ such that ${\displaystyle d(x,y)\leq r}$.

If furthermore C is a subset of K, then it is an r-internal covering.

The external covering number of K, denoted ${\displaystyle N_{r}^{\text{ext}}(K)}$, is the minimum cardinality of any external covering of K. The internal covering number, denoted ${\displaystyle N_{r}^{\text{int}}(K)}$, is the minimum cardinality of any internal covering.

A subset P of K is a packing if ${\displaystyle P\subseteq K}$ and the set ${\displaystyle \{B_{r}(x)\}_{x\in P}}$ is pairwise disjoint. The packing number of K, denoted ${\displaystyle N_{r}^{\text{pack}}(K)}$, is the maximum cardinality of any packing of K.

A subset S of K is r-separated if each pair of points x and y in S satisfies d(x, y) ≥ r. The metric entropy of K, denoted ${\displaystyle N_{r}^{\text{met}}(K)}$, is the maximum cardinality of any r-separated subset of K.

## Examples

1. The metric space is the real line ${\displaystyle \mathbb {R} }$. ${\displaystyle K\subset \mathbb {R} }$ is a set of real numbers whose absolute value is at most ${\displaystyle k}$. Then, there is an external covering of ${\displaystyle \lceil 2k/r\rceil }$ intervals of length ${\displaystyle r}$, covering the interval ${\displaystyle [k,-k]}$. Hence:

${\displaystyle N_{r}^{\text{ext}}(K)\leq (2k/r)}$

2. The metric space is the Euclidean space ${\displaystyle \mathbb {R} ^{m}}$ with the Euclidean metric. ${\displaystyle K\subset \mathbb {R} ^{m}}$ is a set of vectors whose length (norm) is at most ${\displaystyle k}$. If ${\displaystyle K}$ lies in a d-dimensional subspace of ${\displaystyle \mathbb {R} ^{m}}$, then:[1]:337

${\displaystyle N_{r}^{\text{ext}}(K)\leq (2k{\sqrt {d}}/r)^{d}}$.

3. The metric space is the space of real-valued functions, with the l-infinity metric. The covering number ${\displaystyle N_{r}^{\text{int}}(K)}$ is the smallest number ${\displaystyle k}$ such that, there exist ${\displaystyle h_{1},\ldots ,h_{k}\in K}$ such that, for all ${\displaystyle h\in K}$ there exists ${\displaystyle i\in \{1,\ldots ,k\}}$ such that the supremum distance between ${\displaystyle h}$ and ${\displaystyle h_{i}}$ is at most ${\displaystyle r}$. The above bound is not relevant since the space is ${\displaystyle \infty }$-dimensional. However, when ${\displaystyle K}$ is a compact set, every covering of it has a finite sub-covering, so ${\displaystyle N_{r}^{\text{int}}(K)}$ is finite.[2]:61

## Properties

1. The internal and external covering numbers, the packing number, and the metric entropy are all closely related. The following chain of inequalities holds for any subset K of a metric space and any positive real number r.[3]

${\displaystyle N_{2r}^{\text{met}}(K)\leq N_{r}^{\text{pack}}(K)\leq N_{r}^{\text{ext}}(K)\leq N_{r}^{\text{int}}(K)\leq N_{r}^{\text{met}}(K)}$

2. Each function except the internal covering number is non-increasing in r and non-decreasing in K. The internal covering number is monotone in r but not necessarily in K.

The following properties relate to covering numbers in the standard Euclidean space ${\displaystyle \mathbb {R} ^{m}}$:[1]:338

3. If all vectors in ${\displaystyle K}$ are translated by a constant vector ${\displaystyle k_{0}\in \mathbb {R} ^{m}}$, then the covering number does not change.

4. If all vectors in ${\displaystyle K}$ are multiplied by a scalar ${\displaystyle k\in \mathbb {R} }$, then:

for all ${\displaystyle r}$: ${\displaystyle N_{|k|\cdot r}^{\text{ext}}(k\cdot K)=N_{r}^{\text{ext}}(K)}$

5. If all vectors in ${\displaystyle K}$ are operated by a Lipschitz function ${\displaystyle \phi }$ with Lipschitz constant ${\displaystyle k}$, then:

for all ${\displaystyle r}$: ${\displaystyle N_{|k|\cdot r}^{\text{ext}}(\phi \circ K)\leq N_{r}^{\text{ext}}(K)}$

## Application to machine learning

Let ${\displaystyle K}$ be a space of real-valued functions, with the l-infinity metric (see example 3 above). Suppose all functions in ${\displaystyle K}$ are bounded by a real constant ${\displaystyle M}$. Then, the covering number can be used to bound the generalization error of learning functions from ${\displaystyle K}$, relative to the squared loss:[2]:61

${\displaystyle \mathrm {Prob} {\big [}\sup _{h\in K}|{\text{GeneralizationError}}(h)-{\text{EmpiricalError}}(h)|\geq \epsilon {\big ]}\leq N_{r}^{\text{int}}(K)\cdot 2\exp {-m\epsilon ^{2} \over 2M^{4}}}$
where ${\displaystyle r={\epsilon \over 8M}}$ and ${\displaystyle m}$ is the number of samples.