Edgeworth series: Difference between revisions

Revision as of 08:58, 25 September 2014

The Gram–Charlier A series (named in honor of Jørgen Pedersen Gram and Carl Charlier) and the Edgeworth series (named in honor of Francis Ysidro Edgeworth) are series that approximate a probability distribution in terms of its cumulants. The series are the same, but the arrangement of terms (and thus the accuracy of truncating the series) differs.

Gram–Charlier A series

The key idea of these expansions is to approximate the probability density function $F$ of a distribution in terms of the characteristic function of a distribution with known and suitable properties, and to recover $F$ through the inverse Fourier transform.

We examine a continuous random variable. Let $f$ be the characteristic function of its distribution whose density function is $F$, and $\kappa _{r}$ its cumulants. We expand in terms of a known distribution with probability density function $Ψ$, characteristic function $ψ$, and cumulants $\gamma _{r}$. The density $Ψ$ is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have the following formal identity:

f(t)=\exp \left[\sum _{r=1}^{\infty }(\kappa _{r}-\gamma _{r}){\frac {(it)^{r}}{r!}}\right]\psi (t)\,.

By the properties of the Fourier transform, $(it)^{r}\psi (t)$ is the Fourier transform of $(-1)^{r}D^{r}\Psi (x)$, where $D$ is the differential operator with respect to $x$. Thus, we find for $F$ the formal expansion

F(x)=\exp \left[\sum _{r=1}^{\infty }(\kappa _{r}-\gamma _{r}){\frac {(-D)^{r}}{r!}}\right]\Psi (x)\,.

If $Ψ$ is chosen as the normal density with mean and variance as given by $F$, that is, mean $\mu =\kappa _{1}$ and variance $\sigma ^{2}=\kappa _{2}$, then the expansion becomes

F(x)=\exp \left[\sum _{r=3}^{\infty }\kappa _{r}{\frac {(-D)^{r}}{r!}}\right]{\frac {1}{{\sqrt {2\pi }}\sigma }}\exp \left[-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}\right].

By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram–Charlier A series. If we include only the first two correction terms to the normal distribution, we obtain

F(x)\approx {\frac {1}{{\sqrt {2\pi }}\sigma }}\exp \left[-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}\right]\left[1+{\frac {\kappa _{3}}{3!\sigma ^{3}}}H_{3}\left({\frac {x-\mu }{\sigma }}\right)+{\frac {\kappa _{4}}{4!\sigma ^{4}}}H_{4}\left({\frac {x-\mu }{\sigma }}\right)\right]\,,

with $H_{3}(x)=x^{3}-3x$ and $H_{4}(x)=x^{4}-6x^{2}+3$ (these are Hermite polynomials).
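As a rough illustration, the two-term approximation above can be evaluated directly. The following Python sketch is not part of the original article: the function names are hypothetical, and it uses only the standard library.

```python
import math


def hermite_3(z):
    """Hermite polynomial H_3(z) = z^3 - 3z, as given in the text."""
    return z**3 - 3.0 * z


def hermite_4(z):
    """Hermite polynomial H_4(z) = z^4 - 6z^2 + 3, as given in the text."""
    return z**4 - 6.0 * z**2 + 3.0


def gram_charlier_a_density(x, mu, sigma, kappa3, kappa4):
    """Normal density times the first two Gram-Charlier A correction terms."""
    z = (x - mu) / sigma
    normal = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)
    correction = (
        1.0
        + kappa3 / (6.0 * sigma**3) * hermite_3(z)
        + kappa4 / (24.0 * sigma**4) * hermite_4(z)
    )
    return normal * correction


# With kappa3 = kappa4 = 0 the correction factor is 1, so the plain
# normal density is recovered.
print(gram_charlier_a_density(0.5, mu=0.0, sigma=1.0, kappa3=0.0, kappa4=0.0))
```

Passing the third and fourth cumulants rather than skewness and kurtosis keeps the code term-by-term parallel to the formula; a caller who has standardized cumulants would multiply them by $\sigma ^{3}$ and $\sigma ^{4}$ first.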

Note that this expression is not guaranteed to be positive, and therefore it need not be a valid probability density. The Gram–Charlier A series diverges in many cases of interest: it converges only if $F(x)$ falls off faster than $\exp(-x^{2}/4)$ at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. For this reason, the Edgeworth series (see next section) is generally preferred over the Gram–Charlier A series.

Edgeworth series

Edgeworth developed a similar expansion as an improvement to the central limit theorem. The advantage of the Edgeworth series is that the error is controlled, so that it is a true asymptotic expansion.

Let {X_i} be a sequence of independent and identically distributed random variables with mean μ and variance σ², and let Y_n be their standardized sums:

Y_{n}={\frac {1}{\sqrt {n}}}\sum _{i=1}^{n}{\frac {X_{i}-\mu }{\sigma }}.

Let F_n denote the cumulative distribution functions of the variables Y_n. Then by the central limit theorem,

\lim _{n\to \infty }F_{n}(x)=\Phi (x)\equiv \int _{-\infty }^{x}{\tfrac {1}{\sqrt {2\pi }}}e^{-{\frac {1}{2}}q^{2}}dq

for every x, as long as the mean and variance are finite.
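The convergence of F_n to Φ can be checked numerically. The sketch below is an illustration only: the choice of exponential summands with rate 1 (so μ = σ = 1), the seed, the replication count, and all function names are assumptions made here, not part of the article.

```python
import math
import random


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def standardized_sum(sample, mu, sigma):
    """Y_n = n^(-1/2) * sum((X_i - mu) / sigma) for one sample X_1..X_n."""
    n = len(sample)
    return sum((x - mu) / sigma for x in sample) / math.sqrt(n)


def empirical_cdf_of_standardized_sum(x, n, replications, rng):
    """Monte Carlo estimate of F_n(x) for exponential(1) summands."""
    hits = 0
    for _ in range(replications):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if standardized_sum(sample, mu=1.0, sigma=1.0) <= x:
            hits += 1
    return hits / replications


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
for n in (2, 20, 200):
    estimate = empirical_cdf_of_standardized_sum(0.0, n, 2000, rng)
    print(n, estimate, normal_cdf(0.0))
```

For skewed summands such as these, the estimates of F_n(0) should drift toward Φ(0) = 0.5 only slowly as n grows, which is the gap the Edgeworth correction terms are meant to describe.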

Now assume that the random variables X_i have mean μ, variance σ², and higher cumulants κ_r=σ^rλ_r. If we expand in terms of the standard normal distribution, that is, if we set

\Psi (x)={\frac {1}{\sqrt {2\pi }}}\exp(-{\tfrac {1}{2}}x^{2})

then the cumulant differences in the formal expression of the characteristic function f_n(t) of F_n are

\kappa _{1}^{F(n)}-\gamma _{1}=0,

\kappa _{2}^{F(n)}-\gamma _{2}=0,

\kappa _{r}^{F(n)}-\gamma _{r}={\frac {\kappa _{r}}{\sigma ^{r}n^{r/2-1}}}={\frac {\lambda _{r}}{n^{r/2-1}}};\qquad r\geq 3.
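The last identity follows from two standard properties of cumulants: they add over independent summands, and the r-th cumulant scales with the r-th power of a constant factor. A minimal sketch of that bookkeeping follows, with a hypothetical function name and exponential summands as an assumed example.

```python
import math


def standardized_sum_cumulant(kappa_r, sigma, n, r):
    """r-th cumulant (r >= 3) of Y_n, given the r-th cumulant kappa_r of each X_i.

    Summing n independent copies multiplies the cumulant by n; dividing the
    sum by sigma * sqrt(n) divides it by (sigma * sqrt(n)) ** r.
    """
    return n * kappa_r / (sigma * math.sqrt(n)) ** r


# Assumed example: exponential(1) summands, for which kappa_r = (r - 1)!
# and sigma = 1, so lambda_r = kappa_r.
n = 4
for r in (3, 4, 5):
    kappa_r = math.factorial(r - 1)
    lambda_r = kappa_r
    direct = standardized_sum_cumulant(kappa_r, 1.0, n, r)
    formula = lambda_r / n ** (r / 2 - 1)
    print(r, direct, formula)
```

Both columns should agree for each r, which is the statement that the r-th cumulant difference is of order n^{-(r/2-1)}.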

The Edgeworth series is developed similarly to the Gram–Charlier A series, except that terms are now collected according to powers of n. Thus, we have

f_{n}(t)=\left[1+\sum _{j=1}^{\infty }{\frac {P_{j}(it)}{n^{j/2}}}\right]\exp(-t^{2}/2)\,,

where P_j(x) is a polynomial of degree 3j. Again, after the inverse Fourier transform, the distribution function F_n follows as

F_{n}(x)=\Phi (x)+\sum _{j=1}^{\infty }{\frac {P_{j}(-D)}{n^{j/2}}}\Phi (x)\,.

The first five terms of the expansion are^[1]

{\begin{aligned}F_{n}(x)&=\Phi (x)\\&\quad -{\frac {1}{n^{\frac {1}{2}}}}\left({\tfrac {1}{6}}\lambda _{3}\,\Phi ^{(3)}(x)\right)\\&\quad +{\frac {1}{n}}\left({\tfrac {1}{24}}\lambda _{4}\,\Phi ^{(4)}(x)+{\tfrac {1}{72}}\lambda _{3}^{2}\,\Phi ^{(6)}(x)\right)\\&\quad -{\frac {1}{n^{\frac {3}{2}}}}\left({\tfrac {1}{120}}\lambda _{5}\,\Phi ^{(5)}(x)+{\tfrac {1}{144}}\lambda _{3}\lambda _{4}\,\Phi ^{(7)}(x)+{\tfrac {1}{1296}}\lambda _{3}^{3}\,\Phi ^{(9)}(x)\right)\\&\quad +{\frac {1}{n^{2}}}\left({\tfrac {1}{720}}\lambda _{6}\,\Phi ^{(6)}(x)+\left({\tfrac {1}{1152}}\lambda _{4}^{2}+{\tfrac {1}{720}}\lambda _{3}\lambda _{5}\right)\Phi ^{(8)}(x)+{\tfrac {1}{1728}}\lambda _{3}^{2}\lambda _{4}\,\Phi ^{(10)}(x)+{\tfrac {1}{31104}}\lambda _{3}^{4}\,\Phi ^{(12)}(x)\right)\\&\quad +O\left(n^{-{\frac {5}{2}}}\right).\end{aligned}}

Here, $\Phi ^{(j)}(x)$ is the j-th derivative of $\Phi (\cdot )$ at the point x. The derivatives of the density of the normal distribution are related to the normal density by $\phi ^{(n)}(x)=(-1)^{n}H_{n}(x)\phi (x)$ (where $H_{n}$ is the Hermite polynomial of order n), which explains the alternative representations in terms of the density function. Blinnikov and Moessner (1998) have given a simple algorithm to calculate higher-order terms of the expansion.
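Because $\Phi ^{(j)}=\phi ^{(j-1)}$, the Hermite relation turns each derivative of Φ into a polynomial times the normal density. The sketch below keeps only the n^{-1/2} and n^{-1} terms of the expansion; it is an illustration with hypothetical function names, not the Blinnikov and Moessner algorithm, and the exponential test case is an assumption chosen because its exact distribution function is known.

```python
import math


def hermite(k, x):
    """Probabilists' Hermite polynomial He_k(x) via He_{k+1} = x*He_k - k*He_{k-1}."""
    previous, current = 1.0, x
    if k == 0:
        return previous
    for j in range(1, k):
        previous, current = current, x * current - j * previous
    return current


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def edgeworth_cdf(x, n, lam3, lam4):
    """Edgeworth approximation to F_n(x), keeping the n^(-1/2) and n^(-1) terms.

    Uses Phi^(j)(x) = (-1)^(j-1) * He_{j-1}(x) * phi(x) for j >= 1.
    """
    term_half = lam3 / 6.0 * hermite(2, x)
    term_one = lam4 / 24.0 * hermite(3, x) + lam3**2 / 72.0 * hermite(5, x)
    return normal_cdf(x) - normal_pdf(x) * (term_half / math.sqrt(n) + term_one / n)


# Assumed example: exponential(1) summands (lambda_3 = 2, lambda_4 = 6), n = 10.
# The sum of 10 such variables is Gamma(10, 1), so F_n(0) = P(sum <= 10)
# has a closed form through the Poisson tail identity.
exact = 1.0 - math.exp(-10.0) * sum(10.0**k / math.factorial(k) for k in range(10))
print(exact, normal_cdf(0.0), edgeworth_cdf(0.0, n=10, lam3=2.0, lam4=6.0))
```

In this example the corrected value should land much closer to the exact one than the plain normal value does. Setting `lam3` and `lam4` to zero recovers Φ itself.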

Note that in the case of lattice distributions (which have discrete values), the Edgeworth expansion must be adjusted to account for the discontinuous jumps between lattice points.^[2]

Illustration: density of the sample mean of three χ² variables

Take $X_{i}\sim \chi ^{2}(k=2),\quad i=1,2,3$, and the sample mean ${\bar {X}}={\frac {1}{3}}\sum _{i=1}^{3}X_{i}$.

We can use several distributions for ${\bar {X}}$:

The exact distribution, which follows a gamma distribution: ${\bar {X}}\sim \mathrm {Gamma} \left(\alpha =n\cdot k/2,\theta =2/n\right)$ = $\mathrm {Gamma} \left(\alpha =3,\theta =2/3\right)$
The asymptotic normal distribution: ${\bar {X}}{\xrightarrow {n\to \infty }}N(k,2\cdot k/n)=N(2,4/3)$
Two Edgeworth expansions, of degree 2 and 3
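The three candidates can be compared numerically. The sketch below differentiates the distribution-function expansion term by term to get a density and rescales it from the standardized sum to the sample mean. The `order` parameter (corrections kept up to n^{-order/2}) is a convention introduced here; how it maps onto the "degree 2 and 3" wording above is not established by the text. The cumulants κ₃ = 8k and κ₄ = 48k of χ²(k) are standard facts, and all names are hypothetical.

```python
import math

N, K = 3, 2                       # sample size and degrees of freedom from the text
MU, SIGMA = K, math.sqrt(2 * K)   # mean and standard deviation of chi^2(k)
LAM3 = 8 * K / SIGMA**3           # kappa_3 = 8k for chi^2(k)
LAM4 = 48 * K / SIGMA**4          # kappa_4 = 48k for chi^2(k)


def hermite(k, x):
    """Probabilists' Hermite polynomial He_k(x) via He_{k+1} = x*He_k - k*He_{k-1}."""
    previous, current = 1.0, x
    if k == 0:
        return previous
    for j in range(1, k):
        previous, current = current, x * current - j * previous
    return current


def exact_density(x):
    """Gamma(alpha = n*k/2, theta = 2/n) density of the sample mean."""
    alpha, theta = N * K / 2, 2 / N
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / theta) / (math.gamma(alpha) * theta**alpha)


def edgeworth_density(x, order):
    """Density of the sample mean with corrections up to n^(-order/2).

    order = 0 is the asymptotic normal density N(k, 2k/n).
    """
    scale = SIGMA / math.sqrt(N)  # standard deviation of the sample mean
    y = (x - MU) / scale
    phi = math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)
    factor = 1.0
    if order >= 1:
        factor += LAM3 / (6 * math.sqrt(N)) * hermite(3, y)
    if order >= 2:
        factor += (LAM4 / 24 * hermite(4, y) + LAM3**2 / 72 * hermite(6, y)) / N
    return phi * factor / scale


for x in (1.0, 2.0, 3.0):
    print(x, exact_density(x), [edgeworth_density(x, order) for order in (0, 1, 2)])
```

Printing the exact density next to the three approximations at a few points near the mean gives a quick sense of how much each added order helps for a sample as small as n = 3.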

Disadvantages of the Edgeworth expansion

Edgeworth expansions can suffer from a few issues:

They are not guaranteed to be a proper probability distribution, as:
- The density need not integrate to 1
- Probabilities can be negative
They can be inaccurate, especially in the tails, mainly for two reasons:
- They are obtained from a Taylor series around the mean
- They guarantee (asymptotically) an absolute error, not a relative one. This is an issue when one wants to approximate very small quantities, for which the absolute error might be small but the relative error large.
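Two of these issues can be seen in the mean-of-three-χ²(2) illustration. The sketch below uses a density that keeps only the n^{-1/2} correction; the evaluation points −1 and 8 are arbitrary choices made here, and the function names are hypothetical.

```python
import math


def first_order_edgeworth_density(x, n, mu, sigma, lam3):
    """Normal density of the sample mean times (1 + lam3 / (6 sqrt(n)) * He_3(y))."""
    scale = sigma / math.sqrt(n)
    y = (x - mu) / scale
    phi = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    he3 = y**3 - 3.0 * y
    return phi * (1.0 + lam3 / (6.0 * math.sqrt(n)) * he3) / scale


def exact_density(x):
    """Gamma(alpha = 3, theta = 2/3) density of the mean of three chi^2(2) variables."""
    if x <= 0.0:
        return 0.0
    return x**2 * math.exp(-1.5 * x) / (math.gamma(3.0) * (2.0 / 3.0) ** 3)


# Left of the support, where the exact density is zero, the approximation
# dips below zero.
negative_value = first_order_edgeworth_density(-1.0, n=3, mu=2.0, sigma=2.0, lam3=2.0)

# Far in the right tail the absolute error is small but the relative error is not.
x_tail = 8.0
approx = first_order_edgeworth_density(x_tail, n=3, mu=2.0, sigma=2.0, lam3=2.0)
exact = exact_density(x_tail)
absolute_error = abs(approx - exact)
relative_error = absolute_error / exact
print(negative_value, absolute_error, relative_error)
```

The tail point is where an absolute-error guarantee says little: both densities are tiny there, so their difference is tiny as well, even though the approximation misses most of the true value.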

References

^ Weisstein, Eric W. "Edgeworth Series". MathWorld.
^ Kolassa, John E., and Peter McCullagh. "Edgeworth series for lattice distributions." The Annals of Statistics 18.2 (1990): 981-985.

@@ Line 2: / Line 2: @@
 ==Gram–Charlier A series==
-The key idea of these expansions is to write the [[Characteristic function (probability theory)|characteristic function]] of the distribution whose [[probability density function]] {{mvar|F}} is to be approximated in terms of the characteristic function of a distribution with known and suitable properties, and to recover {{mvar|F}} through the inverse [[Fourier transform]].
+The key idea of these expansions is to approximate the [[probability density function]] {{mvar|F}} of a distribution in terms of the [[characteristic function]] of a distribution with known and suitable properties, and to recover {{mvar|F}} through the inverse [[Fourier transform]].
 We examine a continuous random variable. Let <math>f</math> be the characteristic function of its distribution whose density function is {{mvar|F}}, and <math>\kappa_r</math> its [[cumulant]]s. We expand in terms of a known distribution with probability density function {{math|Ψ}}, characteristic function {{mvar|ψ}}, and cumulants <math>\gamma_r</math>. The density {{math|Ψ}} is generally chosen to be that of the [[normal distribution]], but other choices are possible as well. By the definition of the cumulants, we have the following formal identity: