Lebesgue differentiation theorem

In mathematics, the Lebesgue differentiation theorem is a theorem of real analysis, which states that for almost every point, the value of an integrable function is the limit of infinitesimal averages taken about the point. The theorem is named for Henri Lebesgue.

Statement

For a Lebesgue integrable real or complex-valued function f on Rn, the indefinite integral is a set function which maps a measurable set A  to the Lebesgue integral of $f \cdot \mathbf{1}_A$, where $\mathbf{1}_{A}$ denotes the characteristic function of the set A. It is usually written

$\int_{A}f\ \mathrm{d}\lambda,$

with λ the n–dimensional Lebesgue measure.

The derivative of this integral at x is defined to be

$\lim_{B \rightarrow x} \frac{1}{|B|} \int_{B}f \, \mathrm{d}\lambda,$

where |B| denotes the volume (i.e., the Lebesgue measure) of a ball B  centered at x, and B → x means that the diameter of B  tends to 0.
The Lebesgue differentiation theorem (Lebesgue 1910) states that this derivative exists and is equal to f(x) at almost every point x ∈ Rn. In fact a slightly stronger statement is true. Note that:

$\left|\frac{1}{|B|} \int_{B}f(y) \, \mathrm{d}\lambda(y) - f(x)\right| = \left|\frac{1}{|B|} \int_{B}f(y) -f(x)\, \mathrm{d}\lambda(y)\right| \le \frac{1}{|B|} \int_{B}|f(y) -f(x)|\, \mathrm{d}\lambda(y).$

The stronger assertion is that the right hand side tends to zero for almost every point x. The points x for which this is true are called the Lebesgue points of f.

A more general version also holds. One may replace the balls B  by a family $\mathcal{V}$ of sets U  of bounded eccentricity. This means that there exists some fixed c > 0 such that each set U  from the family is contained in a ball B  with $|U| \ge c \, |B|$. It is also assumed that every point x ∈ Rn is contained in arbitrarily small sets from $\mathcal{V}$. When these sets shrink to x, the same result holds: for almost every point x,

$f(x) = \lim_{U \rightarrow x, \, U \in \mathcal{V}} \frac{1}{|U|} \int_U f \, \mathrm{d}\lambda.$

The family of cubes is an example of such a family $\mathcal{V}$, as is the family $\mathcal{V}$(m) of rectangles in R2 such that the ratio of sides stays between m−1 and m, for some fixed m ≥ 1. If an arbitrary norm is given on Rn, the family of balls for the metric associated to the norm is another example.

The one-dimensional case was proved earlier by Lebesgue (1904). If f is integrable on the real line, the function

$F(x) = \int_{-\infty}^x f(t) \, \mathrm{d} t$

is almost everywhere differentiable, with $F'(x) = f(x).$

Proof

The theorem in its stronger form—that almost every point is a Lebesgue point of a locally integrable function f—can be proved as a consequence of the weak–L1 estimates for the Hardy–Littlewood maximal function. The proof below follows the standard treatment that can be found in Benedetto & Czaja (2009), Stein & Shakarchi (2005), Wheeden & Zygmund (1977) and Rudin (1987).

Since the statement is local in character, f can be assumed to be zero outside some ball of finite radius and hence integrable. It is then sufficient to prove that the set

$E_\alpha = \Bigl\{ x \in \mathbf{R}^n :\limsup_{|B|\rightarrow 0, \, x \in B} \frac{1}{|B|} \int_B |f(y)-f(x)| \, \mathrm{d}y > 2\alpha \Bigr\}$

has measure 0 for all α > 0.

Let ε > 0 be given. Using the density of continuous functions of compact support in L1(Rn), one can find such a function g satisfying

$\|f - g\|_{L^1} = \int_{\mathbf{R}^n} |f(x) - g(x)| \, \mathrm{d}x < \varepsilon.$

It is then helpful to rewrite the main difference as

$\frac{1}{|B|} \int_B f(y) \, \mathrm{d}y - f(x) = \Bigl(\frac{1}{|B|} \int_B \bigl(f(y) - g(y)\bigr) \, \mathrm{d}y \Bigr) + \Bigl(\frac{1}{|B|}\int_B g(y) \, \mathrm{d}y - g(x) \Bigr)+ \bigl(g(x) - f(x)\bigr).$

The first term can be bounded by the value at x of the maximal function for f − g, denoted here by $(f-g)^*(x)$:

$\frac{1}{|B|} \int_B |f(y) - g(y)| \, \mathrm{d}y \leq \sup_{r>0} \frac{1}{|B_r(x)|}\int_{B_r(x)} |f(y)-g(y)| \, \mathrm{d}y = (f-g)^*(x).$

The second term disappears in the limit since g is a continuous function, and the third term is bounded by |f(x) − g(x)|. For the absolute value of the original difference to be greater than 2α in the limit, at least one of the first or third terms must be greater than α in absolute value. However, the estimate on the Hardy–Littlewood function says that

$\Bigl| \left \{ x : (f-g)^*(x) > \alpha \right \} \Bigr| \leq \frac{A_n}{\alpha} \, \|f - g\|_{L^1} < \frac{A_n}{\alpha} \, \varepsilon,$

for some constant An depending only upon the dimension n. The Markov inequality (also called Tchebyshev's inequality) says that

$\Bigl|\left\{ x : |f(x) - g(x)| > \alpha \right \}\Bigr| \leq \frac{1}{\alpha} \, \|f - g\|_{L^1} < \frac{1}{\alpha} \, \varepsilon$

whence

$|E_\alpha| \leq \frac{A_n+1}{\alpha} \, \varepsilon.$

Since ε was arbitrary, it can be taken to be arbitrarily small, and the theorem follows.

Discussion of proof

The Vitali covering lemma is vital to the proof of this theorem; its role lies in proving the estimate for the Hardy-Littlewood maximal function.

The theorem also holds if balls are replaced, in the definition of the derivative, by families of sets with diameter tending to zero satisfying the Lebesgue's regularity condition, defined above as family of sets with bounded eccentricity. This follows since the same substitution can be made in the statement of the Vitali covering lemma.

Discussion

This is an analogue, and a generalization, of the fundamental theorem of calculus, which equates a Riemann integrable function and the derivative of its (indefinite) integral. It is also possible to show a converse - that every differentiable function is equal to the integral of its derivative, but this requires a Henstock-Kurzweil integral in order to be able to integrate an arbitrary derivative.

A special case of the Lebesgue differentiation theorem is the Lebesgue density theorem, which is equivalent to the differentiation theorem for characteristic functions of measurable sets. The density theorem is usually proved using a simpler method (e.g. see Measure and Category).

This theorem is also true for every finite Borel measure on Rn instead of Lebesgue measure (a proof can be found in e.g. (Ledrappier&Young 1985)).