# Edgeworth series

The Gram–Charlier A series (named in honor of Jørgen Pedersen Gram and Carl Charlier) and the Edgeworth series (named in honor of Francis Ysidro Edgeworth) are series that approximate a probability distribution in terms of its cumulants.[1] The two series are the same; what differs is the arrangement of terms, and thus the accuracy obtained when the series is truncated.[2]

## Gram–Charlier A series

The key idea of these expansions is to write the characteristic function of the distribution whose probability density function F is to be approximated in terms of the characteristic function of a distribution with known and suitable properties, and to recover F through the inverse Fourier transform.

We examine a continuous random variable. Let ${\displaystyle f}$ be the characteristic function of its distribution whose density function is F, and ${\displaystyle \kappa _{r}}$ its cumulants. We expand in terms of a known distribution with probability density function Ψ, characteristic function ψ, and cumulants ${\displaystyle \gamma _{r}}$. The density Ψ is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have (see Wallace, 1958)[3]

${\displaystyle f(t)=\exp \left[\sum _{r=1}^{\infty }\kappa _{r}{\frac {(it)^{r}}{r!}}\right]}$ and
${\displaystyle \psi (t)=\exp \left[\sum _{r=1}^{\infty }\gamma _{r}{\frac {(it)^{r}}{r!}}\right],}$

which gives the following formal identity:

${\displaystyle f(t)=\exp \left[\sum _{r=1}^{\infty }(\kappa _{r}-\gamma _{r}){\frac {(it)^{r}}{r!}}\right]\psi (t)\,.}$

By the properties of the Fourier transform, ${\displaystyle (it)^{r}\psi (t)}$ is the Fourier transform of ${\displaystyle (-1)^{r}[D^{r}\Psi ](-x)}$, where D is the differential operator with respect to x. Thus, after replacing ${\displaystyle x}$ with ${\displaystyle -x}$ on both sides of the equation, we find for F the formal expansion

${\displaystyle F(x)=\exp \left[\sum _{r=1}^{\infty }(\kappa _{r}-\gamma _{r}){\frac {(-D)^{r}}{r!}}\right]\Psi (x)\,.}$

If Ψ is chosen as the normal density with mean and variance as given by F, that is, mean ${\displaystyle \mu =\kappa _{1}}$ and variance ${\displaystyle \sigma ^{2}=\kappa _{2}}$, then the expansion becomes

${\displaystyle F(x)=\exp \left[\sum _{r=3}^{\infty }\kappa _{r}{\frac {(-D)^{r}}{r!}}\right]{\frac {1}{{\sqrt {2\pi }}\sigma }}\exp \left[-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}\right],}$

since ${\displaystyle \gamma _{r}=0}$ for all ${\displaystyle r>2}$, because the cumulants of the normal distribution above order two are 0. By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram–Charlier A series. If we include only the first two correction terms to the normal distribution, we obtain

${\displaystyle F(x)\approx {\frac {1}{{\sqrt {2\pi }}\sigma }}\exp \left[-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}\right]\left[1+{\frac {\kappa _{3}}{3!\sigma ^{3}}}H_{3}\left({\frac {x-\mu }{\sigma }}\right)+{\frac {\kappa _{4}}{4!\sigma ^{4}}}H_{4}\left({\frac {x-\mu }{\sigma }}\right)\right]\,,}$

with ${\displaystyle H_{3}(x)=x^{3}-3x}$ and ${\displaystyle H_{4}(x)=x^{4}-6x^{2}+3}$ (these are Hermite polynomials).
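The two-term approximation above can be evaluated directly. The following is a minimal sketch in Python using only the standard library. The function names and the numerical inputs are illustrative assumptions, not part of the original derivation.

```python
import math

def hermite_h3(z):
    # H_3(z) = z^3 - 3z, as given in the text
    return z**3 - 3.0 * z

def hermite_h4(z):
    # H_4(z) = z^4 - 6z^2 + 3, as given in the text
    return z**4 - 6.0 * z**2 + 3.0

def gram_charlier_a(x, mu, sigma, kappa3, kappa4):
    """Normal density times the two-term Gram-Charlier A correction factor."""
    z = (x - mu) / sigma
    normal_pdf = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)
    correction = (1.0
                  + kappa3 / (6.0 * sigma**3) * hermite_h3(z)
                  + kappa4 / (24.0 * sigma**4) * hermite_h4(z))
    return normal_pdf * correction

# With kappa3 = kappa4 = 0 the correction factor is 1, so the normal density is recovered.
print(gram_charlier_a(0.0, 0.0, 1.0, 0.0, 0.0))
# Hypothetical mildly skewed example (cumulant values chosen only for illustration).
print(gram_charlier_a(0.5, 0.0, 1.0, 0.3, 0.1))
```

Setting both higher cumulants to zero is a convenient sanity check, since the expression must then reduce to the normal density.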

Note that this expression is not guaranteed to be positive, and is therefore not a valid probability density. The Gram–Charlier A series diverges in many cases of interest: it converges only if ${\displaystyle F(x)}$ falls off faster than ${\displaystyle \exp(-(x^{2})/4)}$ at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. For this reason, the Edgeworth series (see next section) is generally preferred over the Gram–Charlier A series.
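A small worked check of the positivity problem, using the cumulants of the unit exponential distribution as an illustrative assumption (this example is not taken from the sources cited here): ${\displaystyle \mu =1}$, ${\displaystyle \sigma =1}$, ${\displaystyle \kappa _{3}=2}$, ${\displaystyle \kappa _{4}=6}$. At ${\displaystyle x=-1}$, that is at ${\displaystyle (x-\mu )/\sigma =-2}$, we have ${\displaystyle H_{3}(-2)=-2}$ and ${\displaystyle H_{4}(-2)=-5}$, so the bracketed factor of the two-term approximation is

${\displaystyle 1+{\frac {2}{3!}}(-2)+{\frac {6}{4!}}(-5)=1-{\frac {2}{3}}-{\frac {5}{4}}=-{\frac {11}{12}}<0\,,}$

and the truncated approximation is negative at that point.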

## The Edgeworth series

Edgeworth developed a similar expansion as an improvement to the central limit theorem.[4] The advantage of the Edgeworth series is that the error is controlled, so that it is a true asymptotic expansion.

Let ${\displaystyle \{Z_{i}\}}$ be a sequence of independent and identically distributed random variables with mean ${\displaystyle \mu }$ and variance ${\displaystyle \sigma ^{2}}$, and let ${\displaystyle X_{n}}$ be their standardized sums:

${\displaystyle X_{n}={\frac {1}{\sqrt {n}}}\sum _{i=1}^{n}{\frac {Z_{i}-\mu }{\sigma }}.}$

Let ${\displaystyle F_{n}}$ denote the cumulative distribution functions of the variables ${\displaystyle X_{n}}$. Then by the central limit theorem,

${\displaystyle \lim _{n\to \infty }F_{n}(x)=\Phi (x)\equiv \int _{-\infty }^{x}{\tfrac {1}{\sqrt {2\pi }}}e^{-{\frac {1}{2}}q^{2}}dq}$

for every ${\displaystyle x}$, as long as the mean and variance are finite.

Now assume that, in addition to having mean ${\displaystyle \mu }$ and variance ${\displaystyle \sigma ^{2}}$, the i.i.d. random variables ${\displaystyle Z_{i}}$ have higher cumulants ${\displaystyle \kappa _{\ell }=\sigma ^{\ell }\lambda _{\ell }}$. If we expand in terms of the standard normal distribution, that is, if we set

${\displaystyle \Psi (x)={\frac {1}{\sqrt {2\pi }}}\exp(-{\tfrac {1}{2}}x^{2})}$

then the cumulant differences in the formal expression of the characteristic function ${\displaystyle f_{n}(t)}$ of ${\displaystyle F_{n}}$ are

${\displaystyle \kappa _{1}^{F_{n}}-\gamma _{1}=0,}$
${\displaystyle \kappa _{2}^{F_{n}}-\gamma _{2}=0,}$
${\displaystyle \kappa _{r}^{F_{n}}-\gamma _{r}={\frac {\kappa _{r}}{\sigma ^{r}n^{r/2-1}}}={\frac {\lambda _{r}}{n^{r/2-1}}};\qquad r\geq 3.}$
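The scaling in the last line follows from two standard properties of cumulants: the cumulants of independent summands add, and the r-th cumulant of ${\displaystyle cY}$ is ${\displaystyle c^{r}}$ times the r-th cumulant of ${\displaystyle Y}$. A short derivation, for ${\displaystyle r\geq 2}$ (where the shift by ${\displaystyle \mu }$ has no effect):

${\displaystyle \kappa _{r}^{F_{n}}=\left({\frac {1}{\sigma {\sqrt {n}}}}\right)^{r}\cdot n\,\kappa _{r}={\frac {\kappa _{r}}{\sigma ^{r}n^{r/2-1}}}\,.}$

For ${\displaystyle r=2}$ this equals 1, which matches ${\displaystyle \gamma _{2}=1}$ because ${\displaystyle \kappa _{2}=\sigma ^{2}}$. For ${\displaystyle r\geq 3}$ we have ${\displaystyle \gamma _{r}=0}$, which leaves ${\displaystyle \lambda _{r}/n^{r/2-1}}$.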

The Edgeworth series is developed similarly to the Gram–Charlier A series, except that terms are now collected according to powers of ${\displaystyle n}$. Thus, we have

${\displaystyle f_{n}(t)=\left[1+\sum _{j=1}^{\infty }{\frac {P_{j}(it)}{n^{j/2}}}\right]\exp(-t^{2}/2)\,,}$

where ${\displaystyle P_{j}(x)}$ is a polynomial of degree ${\displaystyle 3j}$. Again, after inverse Fourier transform, the distribution function ${\displaystyle F_{n}}$ follows as

${\displaystyle F_{n}(x)=\Phi (x)+\sum _{j=1}^{\infty }{\frac {P_{j}(-D)}{n^{j/2}}}\Phi (x)\,.}$
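As a worked instance of collecting powers of ${\displaystyle n}$, write ${\displaystyle u=it}$ (a shorthand introduced here). The cumulant differences give ${\displaystyle f_{n}(t)=\exp \left[\sum _{r=3}^{\infty }{\frac {\lambda _{r}u^{r}}{r!\,n^{r/2-1}}}\right]\exp(-t^{2}/2)}$, and expanding the first exponential up to order ${\displaystyle n^{-1}}$ yields the first two polynomials:

${\displaystyle P_{1}(u)={\frac {\lambda _{3}}{6}}u^{3},\qquad P_{2}(u)={\frac {\lambda _{4}}{24}}u^{4}+{\frac {\lambda _{3}^{2}}{72}}u^{6}\,.}$

These have degrees 3 and 6, in line with the general degree ${\displaystyle 3j}$.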

The first five terms of the expansion are[5]

${\displaystyle {\begin{aligned}F_{n}(x)&=\Phi (x)\\&\quad -{\frac {1}{n^{\frac {1}{2}}}}\left({\tfrac {1}{6}}\lambda _{3}\,\Phi ^{(3)}(x)\right)\\&\quad +{\frac {1}{n}}\left({\tfrac {1}{24}}\lambda _{4}\,\Phi ^{(4)}(x)+{\tfrac {1}{72}}\lambda _{3}^{2}\,\Phi ^{(6)}(x)\right)\\&\quad -{\frac {1}{n^{\frac {3}{2}}}}\left({\tfrac {1}{120}}\lambda _{5}\,\Phi ^{(5)}(x)+{\tfrac {1}{144}}\lambda _{3}\lambda _{4}\,\Phi ^{(7)}(x)+{\tfrac {1}{1296}}\lambda _{3}^{3}\,\Phi ^{(9)}(x)\right)\\&\quad +{\frac {1}{n^{2}}}\left({\tfrac {1}{720}}\lambda _{6}\,\Phi ^{(6)}(x)+\left({\tfrac {1}{1152}}\lambda _{4}^{2}+{\tfrac {1}{720}}\lambda _{3}\lambda _{5}\right)\Phi ^{(8)}(x)+{\tfrac {1}{1728}}\lambda _{3}^{2}\lambda _{4}\,\Phi ^{(10)}(x)+{\tfrac {1}{31104}}\lambda _{3}^{4}\,\Phi ^{(12)}(x)\right)\\&\quad +O\left(n^{-{\frac {5}{2}}}\right).\end{aligned}}}$

Here, ${\displaystyle \Phi ^{(j)}(x)}$ is the j-th derivative of Φ(·) at the point x. The derivatives of the normal density ${\displaystyle \phi =\Phi '}$ are related to the density itself by ${\displaystyle \phi ^{(n)}(x)=(-1)^{n}H_{n}(x)\phi (x)}$, where ${\displaystyle H_{n}}$ is the Hermite polynomial of order n. Since ${\displaystyle \Phi ^{(j)}=\phi ^{(j-1)}}$, each term of the expansion can equivalently be written as a Hermite polynomial multiplied by the density, which explains the alternative representations in terms of the density function. Blinnikov and Moessner (1998) have given a simple algorithm to calculate higher-order terms of the expansion.
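The expansion through the ${\displaystyle 1/n}$ term can be checked numerically. The sketch below rewrites the derivatives of Φ via Hermite polynomials and compares the result with an exact distribution function. The choice of unit exponential summands (for which ${\displaystyle \lambda _{3}=2}$ and ${\displaystyle \lambda _{4}=6}$) and all function names are assumptions made for illustration only.

```python
import math

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def edgeworth_cdf(x, n, lam3, lam4):
    """Edgeworth approximation of F_n(x), keeping terms through 1/n.

    Uses Phi^(3) = H_2 * phi, Phi^(4) = -H_3 * phi and Phi^(6) = -H_5 * phi.
    """
    h2 = x**2 - 1.0
    h3 = x**3 - 3.0 * x
    h5 = x**5 - 10.0 * x**3 + 15.0 * x
    correction = (lam3 * h2 / (6.0 * math.sqrt(n))
                  + (lam4 * h3 / 24.0 + lam3**2 * h5 / 72.0) / n)
    return std_normal_cdf(x) - std_normal_pdf(x) * correction

def exact_cdf_exponential_sum(x, n):
    """Exact F_n(x) for Z_i ~ Exp(1): the sum of n such variables is Gamma(n, 1)."""
    s = n + x * math.sqrt(n)  # undo the standardization (mu = sigma = 1)
    if s <= 0.0:
        return 0.0
    partial = sum(s**k / math.factorial(k) for k in range(n))
    return 1.0 - math.exp(-s) * partial

n = 5
for x in (-1.0, 0.0, 0.5, 1.0):
    print(x, exact_cdf_exponential_sum(x, n), std_normal_cdf(x),
          edgeworth_cdf(x, n, 2.0, 6.0))
```

Each printed row lists the exact value, the plain normal approximation, and the Edgeworth approximation, so the effect of the correction terms can be read off directly.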

Note that in the case of lattice distributions (which take discrete values), the Edgeworth expansion must be adjusted to account for the discontinuous jumps between lattice points.[6]

## Illustration: density of the sample mean of three ${\displaystyle \chi ^{2}}$

Figure: Density of the sample mean of three ${\displaystyle \chi ^{2}}$ variables. The chart compares the true density, the normal approximation, and two Edgeworth expansions.

Take ${\displaystyle X_{i}\sim \chi ^{2}(k=2)\qquad i=1,2,3}$ and the sample mean ${\displaystyle {\bar {X}}={\frac {1}{3}}\sum _{i=1}^{3}X_{i}}$.

We can use several distributions for ${\displaystyle {\bar {X}}}$:

• The exact distribution, which follows a gamma distribution: ${\displaystyle {\bar {X}}\sim \mathrm {Gamma} \left(\alpha =n\cdot k/2,\theta =2/n\right)}$ = ${\displaystyle \mathrm {Gamma} \left(\alpha =3,\theta =2/3\right)}$
• The asymptotic normal distribution: ${\displaystyle {\bar {X}}{\xrightarrow {n\to \infty }}N(k,2\cdot k/n)=N(2,4/3)}$
• Two Edgeworth expansions, of degrees 2 and 3
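The three kinds of approximation listed above can be compared numerically. The sketch below is a minimal illustration under stated assumptions: it uses the ${\displaystyle \chi ^{2}(k)}$ cumulants ${\displaystyle \kappa _{3}=8k}$ and ${\displaystyle \kappa _{4}=48k}$, and it truncates the Edgeworth density after the ${\displaystyle n^{-1/2}}$ term and after the ${\displaystyle n^{-1}}$ term. Whether these two truncations are exactly the expansions "of degrees 2 and 3" meant in the list is an assumption, and all function names are hypothetical.

```python
import math

K, N = 2, 3                      # chi-squared degrees of freedom, sample size
MU, SIGMA = K, math.sqrt(2 * K)  # mean and standard deviation of one chi2(K) variable
LAM3 = 8.0 * K / SIGMA**3        # kappa_3 / sigma^3, with kappa_3 = 8K
LAM4 = 48.0 * K / SIGMA**4       # kappa_4 / sigma^4, with kappa_4 = 48K

def exact_pdf(x):
    """Gamma(alpha = N*K/2, theta = 2/N) density of the sample mean."""
    alpha, theta = N * K / 2.0, 2.0 / N
    if x <= 0.0:
        return 0.0
    return x**(alpha - 1.0) * math.exp(-x / theta) / (math.gamma(alpha) * theta**alpha)

def normal_pdf(x):
    """Asymptotic normal density N(K, 2K/N) of the sample mean."""
    var = 2.0 * K / N
    return math.exp(-(x - MU)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def edgeworth_pdf(x, order):
    """Normal density times Edgeworth corrections through n^(-order/2), order 1 or 2."""
    z = (x - MU) * math.sqrt(N) / SIGMA  # standardized argument
    factor = 1.0 + LAM3 * (z**3 - 3.0 * z) / (6.0 * math.sqrt(N))
    if order >= 2:
        h4 = z**4 - 6.0 * z**2 + 3.0
        h6 = z**6 - 15.0 * z**4 + 45.0 * z**2 - 15.0
        factor += (LAM4 * h4 / 24.0 + LAM3**2 * h6 / 72.0) / N
    return normal_pdf(x) * factor

for x in (0.5, 1.0, 2.0, 4.0):
    print(x, exact_pdf(x), normal_pdf(x), edgeworth_pdf(x, 1), edgeworth_pdf(x, 2))
```

Each row prints the exact gamma density followed by the normal approximation and the two Edgeworth truncations at the same point.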

## Disadvantages of the Edgeworth expansion

Edgeworth expansions can suffer from a few issues:

• They are not guaranteed to be a proper probability distribution, because:
    • The density need not integrate to 1
    • Probabilities can be negative
• They can be inaccurate, especially in the tails, mainly for two reasons:
    • They are obtained from a Taylor series around the mean
    • They guarantee (asymptotically) a bound on the absolute error, not on the relative error. This is an issue when one wants to approximate very small quantities, for which the absolute error may be small while the relative error is large.
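The tail behaviour described in this list can be illustrated with a short numerical sketch. It evaluates the Edgeworth distribution function through the ${\displaystyle 1/n}$ term far in the lower tail and compares it with an exact value. The unit exponential summands, the values of n and x, and the function names are all assumptions chosen for illustration.

```python
import math

def edgeworth_cdf(x, n, lam3, lam4):
    """Edgeworth approximation of the CDF of a standardized sum, through 1/n."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    h2, h3 = x**2 - 1.0, x**3 - 3.0 * x
    h5 = x**5 - 10.0 * x**3 + 15.0 * x
    return Phi - phi * (lam3 * h2 / (6.0 * math.sqrt(n))
                        + (lam4 * h3 / 24.0 + lam3**2 * h5 / 72.0) / n)

def exact_cdf(x, n):
    """Exact CDF of the standardized sum of n Exp(1) variables (a Gamma(n, 1) sum)."""
    s = n + x * math.sqrt(n)  # undo the standardization (mu = sigma = 1)
    if s <= 0.0:
        return 0.0
    return 1.0 - math.exp(-s) * sum(s**k / math.factorial(k) for k in range(n))

# Lower-tail point; lam3 = 2 and lam4 = 6 are the standardized cumulants of Exp(1).
n, x = 5, -2.2
approx, exact = edgeworth_cdf(x, n, 2.0, 6.0), exact_cdf(x, n)
print("exact:", exact)
print("Edgeworth:", approx)
print("absolute error:", abs(approx - exact))
```

In this setting the exact probability is positive but tiny, while the truncated expansion returns a negative value: the absolute error is modest, yet the relative error is not meaningful.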

## References

1. ^ Stuart, A., & Kendall, M. G. (1968). The advanced theory of statistics. Hafner Publishing Company.
2. ^ Kolassa, J. E. (2006). Series approximation methods in statistics (Vol. 88). Springer Science & Business Media.
3. ^ Wallace, D. L. (1958). Asymptotic approximations to distributions. The Annals of Mathematical Statistics, 635–654.
4. ^ Hall, P. (2013). The bootstrap and Edgeworth expansion. Springer Science & Business Media.
5. ^
6. ^ Kolassa, John E.; McCullagh, Peter (1990). "Edgeworth series for lattice distributions". Annals of Statistics. 18 (2): 981–985. doi:10.1214/aos/1176347637. JSTOR 2242145.