V-statistics are a class of statistics named for Richard von Mises who developed their asymptotic distribution theory in a fundamental paper in 1947. V-statistics are closely related to U-statistics (U for “unbiased”) introduced by Wassily Hoeffding in 1948. A V-statistic is a statistical function (of a sample) defined by a particular statistical functional of a probability distribution.
Statistics that can be represented as functionals $T(F_n)$ of the empirical distribution function $F_n$ are called statistical functions. Differentiability of the functional $T$ plays a key role in the von Mises approach; thus von Mises considers differentiable statistical functionals.
Examples of statistical functions
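- The k-th central moment is the functional $T(F) = \int (x - \mu)^k \, dF(x)$, where $\mu = E[X]$ is the expected value of X. The associated statistical function is the sample k-th central moment,
  $$T_n = m_k = T(F_n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^k.$$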
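- The chi-squared goodness-of-fit statistic is a statistical function $T(F_n)$, corresponding to the statistical functional
  $$T(F) = \sum_{i=1}^{k} \frac{\left( \int_{A_i} dF - p_i \right)^2}{p_i},$$
  where $A_1, \ldots, A_k$ are the k cells and $p_i$ is the probability of cell $A_i$ specified under the null hypothesis.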
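- The Cramér–von-Mises and Anderson–Darling goodness-of-fit statistics are based on the functional
  $$T(F) = \int \left( F(x) - F_0(x) \right)^2 w(x; F_0) \, dF_0(x),$$
  where $w(x; F_0)$ is a specified weight function and $F_0$ is a specified null distribution. With $w \equiv 1$, $T(F_n)$ is the Cramér–von-Mises statistic; with $w(x; F_0) = [F_0(x)(1 - F_0(x))]^{-1}$, it is the Anderson–Darling statistic.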
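As an illustration of evaluating a functional at the empirical distribution, the following Python sketch computes two of the statistical functions above. The function names are hypothetical, NumPy is assumed to be available, and the Cramér–von-Mises case assumes the weight w ≡ 1 and uses the standard closed form in terms of the order statistics rather than numerical integration.

```python
import numpy as np

def central_moment(x, k):
    """Plug-in estimate T(F_n) of the k-th central moment."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** k)

def cramer_von_mises_functional(x, cdf0):
    """T(F_n) = integral of (F_n - F_0)^2 dF_0, with weight w = 1.

    cdf0 is the null distribution function F_0 (assumed continuous).
    """
    u = np.sort(cdf0(np.asarray(x, dtype=float)))  # F_0 at the order statistics
    n = u.size
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n**2) + np.mean(((2 * i - 1) / (2 * n) - u) ** 2)

sample = np.array([0.1, 0.4, 0.5, 0.9])
m2 = central_moment(sample, 2)
# Null distribution: uniform on [0, 1], whose CDF is the identity clipped to [0, 1].
t_cvm = cramer_von_mises_functional(sample, lambda t: np.clip(t, 0.0, 1.0))
```

Multiplying `t_cvm` by the sample size gives the usual form of the Cramér–von-Mises test statistic.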
Representation as a V-statistic
Suppose $x_1, \ldots, x_n$ is a sample. In typical applications the statistical function has a representation as the V-statistic
$$V_{mn} = \frac{1}{n^m} \sum_{i_1=1}^{n} \cdots \sum_{i_m=1}^{n} h(x_{i_1}, x_{i_2}, \ldots, x_{i_m}),$$
where h is a symmetric kernel function. Serfling discusses how to find the kernel in practice. $V_{mn}$ is called a V-statistic of degree m.
A symmetric kernel of degree 2 is a function h(x, y) such that h(x, y) = h(y, x) for all x and y in the domain of h. For samples $x_1, \ldots, x_n$, the corresponding V-statistic is defined as
$$V_{2,n} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} h(x_i, x_j).$$
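The definition translates directly into a brute-force computation. The following Python sketch is a minimal illustration, not an optimized implementation: the function name `v_statistic` is hypothetical, and the code simply averages the kernel over all n^m tuples of observations (repeated indices included), so its cost grows as n^m.

```python
import itertools
import numpy as np

def v_statistic(x, kernel, m):
    """Degree-m V-statistic: average of kernel over all n**m index tuples."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # product(..., repeat=m) yields every m-tuple of sample values,
    # including tuples that reuse the same observation.
    total = sum(kernel(*args) for args in itertools.product(x, repeat=m))
    return total / n**m

sample = [1.0, 2.0, 4.0]
v1 = v_statistic(sample, lambda a: a, 1)         # degree 1: the sample mean
v2 = v_statistic(sample, lambda a, b: a * b, 2)  # degree 2: the squared sample mean
```

The inclusion of tuples with repeated indices is what distinguishes a V-statistic from the corresponding U-statistic, which averages over distinct indices only.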
Example of a V-statistic
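- An example of a degree-2 V-statistic is the second central moment $m_2$. If $h(x, y) = (x - y)^2/2$, the corresponding V-statistic is
  $$V_{2,n} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{1}{2} (x_i - x_j)^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$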
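The identity between the double sum and the second central moment can be checked numerically. The sketch below (hypothetical function name, NumPy assumed) evaluates the kernel h(x, y) = (x − y)²/2 on all pairs and also forms the U-statistic with the same kernel, which averages over pairs with i ≠ j and is the unbiased sample variance.

```python
import numpy as np

def v_and_u_variance(x):
    """Return (V-statistic, U-statistic) for the kernel h(x, y) = (x - y)^2 / 2."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = 0.5 * (x[:, None] - x[None, :]) ** 2  # h(x_i, x_j) for all pairs (i, j)
    v = h.sum() / n**2                        # all n^2 pairs, diagonal included
    u = h.sum() / (n * (n - 1))               # diagonal terms are zero, so this
                                              # is the average over i != j
    return v, u

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
v, u = v_and_u_variance(data)
# v should agree with np.var(data) and u with np.var(data, ddof=1).
```

The two statistics differ only by the factor (n − 1)/n, so they share the same asymptotic behavior in this example.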
Von Mises' approach is a unifying theory that covers all of the cases above. Informally, the type of asymptotic distribution of a statistical function depends on the order of "degeneracy," which is determined by which term is the first non-vanishing term in the Taylor expansion of the functional T. In case it is the linear term, the limit distribution is normal; otherwise higher order types of distributions arise (under suitable conditions such that a central limit theorem holds).
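Let A(m) denote the property defined by the following two conditions:
- $\operatorname{Var}(h(X_1, \ldots, X_k)) = 0$ for k < m, and $\operatorname{Var}(h(X_1, \ldots, X_k)) > 0$ for k = m;
- $n^{m/2} R_{mn}$ tends to zero (in probability). ($R_{mn}$ is the remainder term in the Taylor series for T.)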
Case m = 1 (Non-degenerate kernel):
In the variance example (4), $m_2$ is asymptotically normal with mean $\sigma^2$ and variance $(\mu_4 - \sigma^4)/n$, where $\mu_4 = E(X - E(X))^4$.
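A small Monte Carlo experiment can make the normal limit for the variance example concrete. The sketch below is illustrative only: it assumes standard normal data (so σ² = 1 and μ₄ = 3), and the sample size, replication count, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 10_000

# Each row is one sample of size n; m2 is the V-statistic (variance with ddof=0).
x = rng.standard_normal((reps, n))
m2 = x.var(axis=1)

sigma2, mu4 = 1.0, 3.0                 # moments of the standard normal
predicted_var = (mu4 - sigma2**2) / n  # asymptotic variance of m2
empirical_mean = m2.mean()             # should be close to sigma2
scaled_var = n * m2.var()              # should be close to mu4 - sigma2**2 = 2
```

A histogram of `m2` over the replications would be expected to look approximately Gaussian around `sigma2` with spread governed by `predicted_var`.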
Case m = 2 (Degenerate kernel):
Suppose A(2) is true, and $E[h^2(X_1, X_2)] < \infty$, $E|h(X_1, X_1)| < \infty$, and $E[h(x, X_1)] \equiv 0$. Then $nV_{2,n}$ converges in distribution to a weighted sum of independent chi-squared variables:
$$n V_{2,n} \xrightarrow{d} \sum_{k=1}^{\infty} \lambda_k Z_k^2,$$
where $Z_k$ are independent standard normal variables and $\lambda_k$ are constants that depend on the distribution F and the functional T. In this case the asymptotic distribution is called a quadratic form of centered Gaussian random variables. The statistic $V_{2,n}$ is called a degenerate kernel V-statistic. The V-statistic associated with the Cramér–von Mises functional (Example 3) is an example of a degenerate kernel V-statistic.
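The degenerate case can be illustrated with a deliberately simple kernel that is not taken from the examples above: h(x, y) = xy with X having mean 0 and variance 1. Here E[h(x, X)] = 0 for every x, the V-statistic reduces to the squared sample mean, and the weighted sum has a single term with weight 1, so n times the statistic should be approximately chi-squared with one degree of freedom. The Python sketch below assumes uniform data scaled to unit variance; the sample size, replication count, and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 20_000

# Uniform on [-sqrt(3), sqrt(3)] has mean 0 and variance 1.
half_width = np.sqrt(3.0)
x = rng.uniform(-half_width, half_width, size=(reps, n))

# For h(x, y) = x * y the double sum factorizes: V_{2,n} = (sample mean)^2.
n_v = n * x.mean(axis=1) ** 2

empirical_mean = n_v.mean()        # chi-squared(1) has mean 1
coverage = np.mean(n_v <= 3.841)   # 3.841 is roughly the 95% point of chi-squared(1)
```

For kernels such as the one behind the Cramér–von Mises statistic, the limit involves infinitely many weights, and the same kind of simulation can be used to approximate its quantiles.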
- Hoeffding, W. (1948). "A class of statistics with asymptotically normal distribution". Annals of Mathematical Statistics 19 (3): 293–325. doi:10.1214/aoms/1177730196. JSTOR 2235637.
- Koroljuk, V.S.; Borovskich, Yu.V. (1994). Theory of U-statistics (English translation by P.V.Malyshev and D.V.Malyshev from the 1989 Ukrainian ed.). Dordrecht: Kluwer Academic Publishers. ISBN 0-7923-2608-3.
- Lee, A.J. (1990). U-Statistics: theory and practice. New York: Marcel Dekker, Inc. ISBN 0-8247-8253-4.
- Neuhaus, G. (1977). "Functional limit theorems for U-statistics in the degenerate case". Journal of Multivariate Analysis 7 (3): 424–439. doi:10.1016/0047-259X(77)90083-5.
- Rosenblatt, M. (1952). "Limit theorems associated with variants of the von Mises statistic". Annals of Mathematical Statistics 23 (4): 617–623. doi:10.1214/aoms/1177729341. JSTOR 2236587.
- Serfling, R.J. (1980). Approximation theorems of mathematical statistics. New York: John Wiley & Sons. ISBN 0-471-02403-1.
- Taylor, R.L.; Daffer, P.Z.; Patterson, R.F. (1985). Limit theorems for sums of exchangeable random variables. New Jersey: Rowman and Allanheld.
- von Mises, R. (1947). "On the asymptotic distribution of differentiable statistical functions". Annals of Mathematical Statistics 18 (2): 309–348. doi:10.1214/aoms/1177730385. JSTOR 2235734.