Empirical process

The study of empirical processes is a branch of mathematical statistics and a sub-area of probability theory. The theory generalizes the central limit theorem to empirical measures, and it finds applications in non-parametric statistics.

Definition

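Let P_n denote the empirical measure of a sample X_1,\dots,X_n of iid random elements of a measurable space S with distribution P. Under certain conditions, the empirical measures P_n converge uniformly to the probability measure P (see the Glivenko–Cantelli theorem). The theory of empirical processes describes the rate of this convergence.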

A centered and scaled version of the empirical measure is the signed measure

G_n(A)=\sqrt{n}(P_n(A)-P(A))

It induces a map on measurable functions f given by

f\mapsto G_n f=\sqrt{n}(P_n-P)f=\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n f(X_i)-\mathbb{E}f\right)

By the central limit theorem, for a fixed measurable set A, G_n(A) converges in distribution to a normal random variable N(0, P(A)(1 − P(A))). Similarly, for a fixed function f, G_nf converges in distribution to a normal random variable N(0,\mathbb{E}(f-\mathbb{E}f)^2), provided that \mathbb{E}f and \mathbb{E}f^2 exist.
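
The pointwise limit above can be checked numerically. The following is a minimal sketch, not part of the original text, under illustrative assumptions: the X_i are drawn from Uniform(0, 1), the set is A = [0, 0.3] (so P(A) = 0.3), and the function names, sample size, number of replications and seed are all hypothetical choices.

```python
import math
import random
import statistics


def empirical_process_value(sample, f, mean_f):
    """Return G_n f = sqrt(n) * (P_n f - P f) for one sample."""
    n = len(sample)
    p_n_f = sum(f(x) for x in sample) / n  # empirical measure applied to f
    return math.sqrt(n) * (p_n_f - mean_f)


def simulate(n=500, replications=2000, seed=0):
    """Collect G_n(A) over repeated Uniform(0, 1) samples, with A = [0, 0.3]."""
    rng = random.Random(seed)

    def indicator(x):
        # f is the indicator function of A (an assumed example set)
        return 1.0 if x <= 0.3 else 0.0

    p_a = 0.3  # P(A) under Uniform(0, 1)
    return [
        empirical_process_value([rng.random() for _ in range(n)], indicator, p_a)
        for _ in range(replications)
    ]


values = simulate()
sample_mean = statistics.fmean(values)
sample_var = statistics.pvariance(values)
limit_var = 0.3 * (1 - 0.3)  # variance of the normal limit, P(A)(1 - P(A))

print(f"mean of G_n(A):     {sample_mean:.3f}")
print(f"variance of G_n(A): {sample_var:.3f} (limit: {limit_var:.3f})")
```

With these settings the simulated mean should be close to 0 and the simulated variance close to P(A)(1 − P(A)) = 0.21, up to Monte Carlo error.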

Definition

\bigl(G_n(c)\bigr)_{c\in\mathcal{C}} is called an empirical process indexed by \mathcal{C}, a collection of measurable subsets of S.
\bigl(G_nf\bigr)_{f\in\mathcal{F}} is called an empirical process indexed by \mathcal{F}, a collection of measurable functions from S to \mathbb{R}.

A significant result in the area of empirical processes is Donsker's theorem. It has led to the study of Donsker classes: classes of sets or functions for which the empirical process indexed by the class converges weakly to a certain Gaussian process. It can be shown that Donsker classes are Glivenko–Cantelli classes; the converse is not true in general.

Example

As an example, consider empirical distribution functions. For real-valued iid random variables X_1,\dots,X_n they are given by

F_n(x)=P_n((-\infty,x])=P_nI_{(-\infty,x]}.

In this case, the empirical process is indexed by the class \mathcal{C}=\{(-\infty,x]:x\in\mathbb{R}\}. It has been shown that \mathcal{C} is a Donsker class; in particular,

\sqrt{n}(F_n(x)-F(x)) converges weakly in \ell^\infty(\mathbb{R}) to a Brownian bridge B(F(x)), where F is the distribution function of the X_i.
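
One consequence of this weak convergence can be illustrated by simulation. When F is continuous, the supremum over x of |\sqrt{n}(F_n(x)-F(x))| converges in distribution to the supremum of |B(u)| over [0, 1], whose distribution function is the Kolmogorov distribution K(t) = 1 - 2\sum_{k\ge 1}(-1)^{k-1}e^{-2k^2t^2}. The sketch below is an illustration only: it assumes Uniform(0, 1) data, so that F(x) = x on [0, 1], and the function names, sample size, threshold and seed are hypothetical choices.

```python
import math
import random


def ks_statistic_uniform(sample):
    """Return sup_x sqrt(n) * |F_n(x) - F(x)| for F the Uniform(0, 1) cdf."""
    xs = sorted(sample)
    n = len(xs)
    # F_n only jumps at the order statistics, so the supremum is reached at a
    # jump: compare F(x) = x with F_n just after and just before each jump.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    return math.sqrt(n) * d


def kolmogorov_cdf(t, terms=100):
    """Distribution function of sup |B(u)| over [0, 1], B a Brownian bridge.

    The truncated series is adequate for t not too close to 0.
    """
    if t <= 0:
        return 0.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t)
        for k in range(1, terms + 1)
    )
    return 1.0 - 2.0 * series


rng = random.Random(1)
n, replications, threshold = 1000, 2000, 1.0
stats = [
    ks_statistic_uniform([rng.random() for _ in range(n)])
    for _ in range(replications)
]
simulated = sum(s <= threshold for s in stats) / replications

print(f"simulated P(sup <= {threshold}): {simulated:.3f}")
print(f"Kolmogorov limit:          {kolmogorov_cdf(threshold):.3f}")
```

The two printed probabilities should agree up to Monte Carlo error and a small finite-sample bias. This checks only one functional of the limit process, not the full weak convergence in \ell^\infty(\mathbb{R}).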

References

  • P. Billingsley, Probability and Measure, John Wiley and Sons, New York, third edition, 1995.
  • M.D. Donsker, Justification and extension of Doob's heuristic approach to the Kolmogorov–Smirnov theorems, Annals of Mathematical Statistics, 23:277–281, 1952.
  • R.M. Dudley, Central limit theorems for empirical measures, Annals of Probability, 6(6): 899–929, 1978.
  • R.M. Dudley, Uniform Central Limit Theorems, Cambridge Studies in Advanced Mathematics, 63, Cambridge University Press, Cambridge, UK, 1999.
  • M.R. Kosorok, Introduction to Empirical Processes and Semiparametric Inference, Springer, New York, 2008.
  • Galen R. Shorack and Jon A. Wellner, Empirical Processes with Applications to Statistics, Wiley, New York, 1986. SIAM Classics edition (2009), Society for Industrial and Applied Mathematics. ISBN 978-0-898716-84-9
  • Aad W. van der Vaart and Jon A. Wellner, Weak Convergence and Empirical Processes: With Applications to Statistics, 2nd ed., Springer, 2000. ISBN 978-0-387-94640-5
  • J. Wolfowitz, Generalization of the theorem of Glivenko–Cantelli. Annals of Mathematical Statistics, 25, 131–138, 1954.
  • K.O. Dzhaparidze and M.S. Nikulin, Probability distributions for the Kolmogorov and omega-square statistics for continuous distributions with scale and shift parameters, Journal of Soviet Mathematics, 20(3):2147–2163, 1982.
