Empirical process

For the process control topic, see Empirical process (process control model).

In probability theory, an empirical process is a stochastic process that describes the proportion of objects in a system in a given state. For a process on a discrete state space, a population continuous-time Markov chain[1][2] or Markov population model[3] is a process that counts the number of objects in a given state (without rescaling). In mean field theory, limit theorems (as the number of objects becomes large) are considered; these generalise the central limit theorem for empirical measures. Applications of the theory of empirical processes arise in non-parametric statistics.[4]

Definition

For X_1, X_2, ..., X_n independent and identically distributed random variables in R with common cumulative distribution function F(x), the empirical distribution function is defined by

F_n(x)=\frac{1}{n}\sum_{i=1}^n I_{(-\infty,x]}(X_i),

where I_C is the indicator function of the set C.

For every fixed x, F_n(x) is a sequence of random variables that converges to F(x) almost surely by the strong law of large numbers. That is, F_n converges to F pointwise. Glivenko and Cantelli strengthened this result by proving that F_n converges to F uniformly, a result known as the Glivenko–Cantelli theorem.[5]
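The uniform convergence described above can be explored numerically. The following is a minimal sketch, assuming a Uniform(0, 1) sample so that F(x) = x on [0, 1]; the helper names ecdf, sup_distance and uniform_cdf are illustrative and do not come from the text.

```python
import bisect
import random


def ecdf(sample):
    """Return F_n as a function: F_n(x) = (1/n) * #{i : X_i <= x}."""
    xs = sorted(sample)
    n = len(xs)

    def F_n(x):
        # bisect_right counts the sample points that are <= x
        return bisect.bisect_right(xs, x) / n

    return F_n


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous cdf F.

    F_n is a step function, so the supremum is reached at a jump:
    compare F with the step heights just before and just after
    each order statistic.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def uniform_cdf(x):
    """Cumulative distribution function of Uniform(0, 1) (assumed example)."""
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 10000):
        sample = [rng.random() for _ in range(n)]
        print(n, sup_distance(sample, uniform_cdf))
```

Under these assumptions the printed distances should tend to shrink as n grows, which is the behaviour the Glivenko–Cantelli theorem guarantees almost surely in the limit.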

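Let P denote the common distribution of the X_i, and let P_n be the empirical measure, P_n(A)=\frac{1}{n}\sum_{i=1}^n I_A(X_i). A centered and scaled version of the empirical measure is the signed measure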

G_n(A)=\sqrt{n}(P_n(A)-P(A)).

It induces a map on measurable functions f given by

f\mapsto G_n f=\sqrt{n}(P_n-P)f=\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n f(X_i)-\mathbb{E}f\right).

By the central limit theorem, G_n(A) converges in distribution to a normal random variable N(0, P(A)(1 − P(A))) for a fixed measurable set A. Similarly, for a fixed function f, G_n f converges in distribution to a normal random variable N(0,\mathbb{E}(f-\mathbb{E}f)^2), provided that \mathbb{E}f and \mathbb{E}f^2 exist.
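The two limiting variances can be checked by simulation. The sketch below is illustrative only: it assumes Uniform(0, 1) observations, the set A = (−∞, 0.3] (so P(A) = 0.3) and the function f(x) = x (so \mathbb{E}f = 1/2 and the variance is 1/12). The function names and parameter values are hypothetical choices made for this example.

```python
import math
import random


def G_n_set(sample, in_A, p_A):
    """G_n(A) = sqrt(n) * (P_n(A) - P(A)), with A given by the predicate in_A."""
    n = len(sample)
    p_n = sum(1 for x in sample if in_A(x)) / n  # empirical measure P_n(A)
    return math.sqrt(n) * (p_n - p_A)


def G_n_func(sample, f, mean_f):
    """G_n f = sqrt(n) * ((1/n) * sum_i f(X_i) - E f)."""
    n = len(sample)
    return math.sqrt(n) * (sum(f(x) for x in sample) / n - mean_f)


def sample_variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def simulate(n=200, reps=2000, seed=0):
    """Monte Carlo variances of G_n(A) and G_n f under Uniform(0, 1).

    Assumed example: A = (-inf, 0.3] and f(x) = x.
    """
    rng = random.Random(seed)
    set_draws, func_draws = [], []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        set_draws.append(G_n_set(sample, lambda x: x <= 0.3, 0.3))
        func_draws.append(G_n_func(sample, lambda x: x, 0.5))
    return sample_variance(set_draws), sample_variance(func_draws)


if __name__ == "__main__":
    var_set, var_func = simulate()
    print("Var G_n(A):", var_set, "limit:", 0.3 * 0.7)
    print("Var G_n f: ", var_func, "limit:", 1 / 12)
```

The simulated variances should land close to P(A)(1 − P(A)) = 0.21 and \mathbb{E}(f-\mathbb{E}f)^2 = 1/12 respectively. A histogram of the draws, not produced here, would be the natural way to inspect the normal shape.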

Definition

\bigl(G_n(c)\bigr)_{c\in\mathcal{C}} is called an empirical process indexed by \mathcal{C}, a collection of measurable subsets of S, the space in which the X_i take values.
\bigl(G_nf\bigr)_{f\in\mathcal{F}} is called an empirical process indexed by \mathcal{F}, a collection of measurable functions from S to \mathbb{R}.

A significant result in the area of empirical processes is Donsker's theorem. It has led to a study of Donsker classes: sets of functions with the useful property that empirical processes indexed by these classes converge weakly to a certain Gaussian process. While it can be shown that Donsker classes are Glivenko–Cantelli classes, the converse is not true in general.

Example

As an example, consider empirical distribution functions. For real-valued iid random variables X_1, X_2, ..., X_n they are given by

F_n(x)=P_n((-\infty,x])=P_nI_{(-\infty,x]}.

In this case, empirical processes are indexed by the class \mathcal{C}=\{(-\infty,x]:x\in\mathbb{R}\}. It has been shown that \mathcal{C} is a Donsker class; in particular,

\sqrt{n}(F_n(x)-F(x)) converges weakly in \ell^\infty(\mathbb{R}) to a Brownian bridge B(F(x)).
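
One way to see why a Brownian bridge appears is to compare covariances. As a sketch (the points s and t are notation introduced here for illustration), for any s, t in \mathbb{R} and any n, independence of the observations gives

```latex
\operatorname{Cov}\bigl(\sqrt{n}(F_n(s)-F(s)),\ \sqrt{n}(F_n(t)-F(t))\bigr)
  = \operatorname{Cov}\bigl(I_{(-\infty,s]}(X_1),\ I_{(-\infty,t]}(X_1)\bigr)
  = F(s \wedge t) - F(s)\,F(t).
```

This agrees with the covariance of B(F(\cdot)), since a standard Brownian bridge B on [0, 1] satisfies \operatorname{Cov}(B(u), B(v)) = u \wedge v - uv. Matching covariances alone does not prove weak convergence in \ell^\infty(\mathbb{R}); that is the content of the Donsker property.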

References

  1. ^ Bortolussi, L.; Hillston, J.; Latella, D.; Massink, M. (2013). "Continuous approximation of collective systems behaviour: A tutorial". Performance Evaluation. doi:10.1016/j.peva.2013.01.001.
  2. ^ Stefanek, A.; Hayden, R. A.; Mac Gonagle, M.; Bradley, J. T. (2012). "Mean-Field Analysis of Markov Models with Reward Feedback". Analytical and Stochastic Modeling Techniques and Applications. Lecture Notes in Computer Science 7314. p. 193. doi:10.1007/978-3-642-30782-9_14. ISBN 978-3-642-30781-2.
  3. ^ Dayar, T. R.; Hermanns, H.; Spieler, D.; Wolf, V. (2011). "Bounding the equilibrium distribution of Markov population models". Numerical Linear Algebra with Applications 18 (6): 931. doi:10.1002/nla.795.
  4. ^ Mojirsheibani, M. (2007). "Nonparametric curve estimation with missing data: A general empirical process approach". Journal of Statistical Planning and Inference 137 (9): 2733–2758. doi:10.1016/j.jspi.2006.02.016.
  5. ^ Wolfowitz, J. (1954). "Generalization of the Theorem of Glivenko-Cantelli". The Annals of Mathematical Statistics 25: 131. doi:10.1214/aoms/1177728852.

Further reading

  • Billingsley, P. (1995). Probability and Measure (Third ed.). New York: John Wiley and Sons. ISBN 0471007102.
  • Donsker, M. D. (1952). "Justification and Extension of Doob's Heuristic Approach to the Kolmogorov-Smirnov Theorems". The Annals of Mathematical Statistics 23 (2): 277. doi:10.1214/aoms/1177729445.
  • Dudley, R. M. (1978). "Central Limit Theorems for Empirical Measures". The Annals of Probability 6 (6): 899. doi:10.1214/aop/1176995384.
  • Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics 63. Cambridge, UK: Cambridge University Press.
  • Kosorok, M. R. (2008). Introduction to Empirical Processes and Semiparametric Inference. Springer Series in Statistics. doi:10.1007/978-0-387-74978-5. ISBN 978-0-387-74977-8.
  • Shorack, G. R.; Wellner, J. A. (2009). Empirical Processes with Applications to Statistics. doi:10.1137/1.9780898719017. ISBN 978-0-89871-684-9.
  • van der Vaart, Aad W.; Wellner, Jon A. (2000). Weak Convergence and Empirical Processes: With Applications to Statistics (2nd ed.). Springer. ISBN 978-0-387-94640-5.
  • Dzhaparidze, K. O.; Nikulin, M. S. (1982). "Probability distributions of the Kolmogorov and omega-square statistics for continuous distributions with shift and scale parameters". Journal of Soviet Mathematics 20 (3): 2147. doi:10.1007/BF01239992.
