# Heavy-tailed distribution

In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:[1] that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.

There are three important subclasses of heavy-tailed distributions: the fat-tailed distributions, the long-tailed distributions, and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.

Usage of the term heavy-tailed is not entirely consistent, and two other definitions are in use. Some authors use the term for distributions that do not have all their power moments finite, and others for distributions that do not have a finite variance. The definition given in this article is the most general in use: it includes all distributions encompassed by the alternative definitions, as well as distributions such as the log-normal that possess all their power moments yet are generally acknowledged to be heavy-tailed. (Occasionally, heavy-tailed is used for any distribution that has heavier tails than the normal distribution.)

## Definition of heavy-tailed distribution

The distribution of a random variable X with distribution function F is said to have a heavy right tail if[1]

$\lim_{x \to \infty} e^{\lambda x}\Pr[X>x] = \infty \quad \mbox{for all } \lambda>0.\,$

This is also written in terms of the tail distribution function

$\overline{F}(x) \equiv \Pr[X>x] \,$

as

$\lim_{x \to \infty} e^{\lambda x}\overline{F}(x) = \infty \quad \mbox{for all } \lambda>0.\,$

This is equivalent to the statement that the moment generating function of $F$, $M_F(t)$, is infinite for all $t > 0$.[2]

The definitions of heavy-tailed for left-tailed or two-tailed distributions are similar.
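As a rough numerical illustration of the definition, the sketch below evaluates $e^{\lambda x}\overline{F}(x)$ for a Pareto tail and for an exponential tail. The function names, the Pareto parameters (shape 2, scale 1), the exponential rate 1 and the choice $\lambda = 0.1$ are assumptions made for this example only. Evaluating at a few finite points can only suggest the limiting behaviour; it does not prove it.

```python
import math

def exp_tail(x, rate=1.0):
    """Tail of an exponential distribution: Pr[X > x] = exp(-rate * x)."""
    return math.exp(-rate * x)

def pareto_tail(x, alpha=2.0, x_min=1.0):
    """Tail of a Pareto distribution: Pr[X > x] = (x_min / x)**alpha for x >= x_min."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

def scaled_tail(tail, x, lam):
    """The quantity e^{lam * x} * Pr[X > x] that appears in the definition."""
    return math.exp(lam * x) * tail(x)

lam = 0.1  # any lam below the exponential rate shows the contrast
for x in (10, 100, 1000):
    # The Pareto column grows without bound; the exponential column shrinks to 0.
    print(x, scaled_tail(pareto_tail, x, lam), scaled_tail(exp_tail, x, lam))
```

For the exponential distribution the scaled tail does diverge once $\lambda$ exceeds the rate, which is why the definition requires divergence for all $\lambda > 0$.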

## Definition of long-tailed distribution

The distribution of a random variable X with distribution function F is said to have a long right tail[1] if for all t > 0,

$\lim_{x \to \infty} \Pr[X>x+t|X>x] =1, \,$

or equivalently

$\overline{F}(x+t) \sim \overline{F}(x) \quad \mbox{as } x \to \infty. \,$

Intuitively, if a quantity with a long right tail exceeds some high level, the probability approaches 1 that it will also exceed any other higher level: if you know the situation is bad, it is probably worse than you think.

All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
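The conditional probability in the definition can be evaluated directly when the tail function is known in closed form. The following sketch compares a Pareto tail, for which the ratio $\overline{F}(x+t)/\overline{F}(x)$ approaches 1, with an exponential tail, for which it stays at $e^{-t}$ for every $x$. The helper names and the parameter values are assumptions chosen for illustration.

```python
import math

def pareto_tail(x, alpha=2.0, x_min=1.0):
    """Pr[X > x] for a Pareto distribution (assumed shape 2, scale 1)."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

def exp_tail(x, rate=1.0):
    """Pr[X > x] for an exponential distribution (assumed rate 1)."""
    return math.exp(-rate * x)

def conditional_exceedance(tail, x, t):
    """Pr[X > x + t | X > x] = tail(x + t) / tail(x)."""
    return tail(x + t) / tail(x)

t = 5.0
for x in (10.0, 100.0, 500.0):
    # Pareto: ratio climbs towards 1. Exponential: ratio stays at exp(-t).
    print(x,
          conditional_exceedance(pareto_tail, x, t),
          conditional_exceedance(exp_tail, x, t))
```

The levels are kept moderate because the exponential tail underflows to zero in floating point for very large `x`, which would make the ratio undefined.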

## Subexponential distributions

Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables $X_1,X_2$ with common distribution function $F$, the convolution of $F$ with itself, $F^{*2}$, is defined, using Lebesgue–Stieltjes integration, by:

$\Pr[X_1+X_2 \leq x] = F^{*2}(x) = \int_{- \infty}^\infty F(x-y)\,dF(y).$

The n-fold convolution $F^{*n}$ is defined in the same way. The tail distribution function $\overline{F}$ is defined as $\overline{F}(x) = 1-F(x)$.

A distribution $F$ on the positive half-line is subexponential[1] if

$\overline{F^{*2}}(x) \sim 2\overline{F}(x) \quad \mbox{as } x \to \infty.$

This implies[3] that, for any $n \geq 1$,

$\overline{F^{*n}}(x) \sim n\overline{F}(x) \quad \mbox{as } x \to \infty.$

The probabilistic interpretation[3] of this is that, for a sum of $n$ independent random variables $X_1,\ldots,X_n$ with common distribution $F$,

$\Pr[X_1+ \cdots +X_n>x] \sim \Pr[\max(X_1, \ldots,X_n)>x] \quad \text{as } x \to \infty.$

This is often known as the principle of the single big jump.[4]
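A small Monte Carlo experiment can make the single big jump visible. The sketch below draws sums of three independent Pareto variables and estimates both $\Pr[X_1+X_2+X_3>x]$ and $\Pr[\max(X_1,X_2,X_3)>x]$ at one high level. The Pareto shape 1.5, the level 200, the number of trials and all function names are assumptions made for this example. The estimates are noisy, so the ratio should only be expected to be close to 1, not equal to it.

```python
import random

def sample_pareto(rng, alpha=1.5, x_min=1.0):
    """Draw one Pareto variate by inverse-transform sampling."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids a zero base
    return x_min * u ** (-1.0 / alpha)

def big_jump_estimates(n_terms, threshold, n_trials=200_000, seed=0):
    """Estimate Pr[sum > threshold] and Pr[max > threshold] from the same draws."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(n_trials):
        xs = [sample_pareto(rng) for _ in range(n_terms)]
        sum_hits += sum(xs) > threshold
        max_hits += max(xs) > threshold
    return sum_hits / n_trials, max_hits / n_trials

p_sum, p_max = big_jump_estimates(n_terms=3, threshold=200.0)
print(p_sum, p_max, p_sum / p_max)
```

Because the variables are positive, every trial in which the maximum exceeds the level is also a trial in which the sum does, so the estimated ratio is never below 1; the point of the experiment is that it is not much above 1 either.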

A distribution $F$ on the whole real line is subexponential if the distribution $F I([0,\infty))$ is.[5] Here $I([0,\infty))$ is the indicator function of the positive half-line. Alternatively, a random variable $X$ supported on the real line is subexponential if and only if $X^+ = \max(0,X)$ is subexponential.

All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.

## Common heavy-tailed distributions

All commonly used heavy-tailed distributions are subexponential.[3]

Those that are one-tailed include:

Those that are two-tailed include:

## Estimating the tail-index

### Pickands tail-index

Let $(X_n , n \geq 1)$ be a sequence of independent and identically distributed random variables with distribution function $F \in D(H(\xi))$, the maximum domain of attraction of the generalized extreme value distribution $H$, where $\xi \in \mathbb{R}$. If $\lim_{n\to\infty} k(n) = \infty$ and $\lim_{n\to\infty} \frac{k(n)}{n}= 0$, then the Pickands tail-index estimator is[3]

$\xi^{Pickands}_{(k(n),n)} =\frac{1}{\ln 2} \ln \left( \frac{X_{(n-k(n)+1,n)} - X_{(n-2k(n)+1,n)}}{X_{(n-2k(n)+1,n)} - X_{(n-4k(n)+1,n)}}\right)$

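where $X_{(i,n)}$ denotes the $i$-th order statistic of the sample, that is, the $i$-th smallest of $X_1,\ldots,X_n$, so that $X_{(n,n)}=\max \left(X_1,\ldots ,X_{n}\right)$. This estimator converges in probability to $\xi$.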
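The Pickands formula translates into a few lines of code once the sample is sorted. The sketch below is one possible reading of it, with the subscripts interpreted as positions in the sample sorted in increasing order. The function names, the synthetic Pareto sample (shape 2, so $\xi = 0.5$), the sample size and the choice $k = 2000$ are all assumptions for illustration; in practice the choice of $k$ is a delicate matter that this example does not address.

```python
import math
import random

def pickands_estimator(data, k):
    """Pickands estimate of xi from three upper order statistics of `data`.

    Requires 4 * k <= len(data) so that the lowest index used is valid.
    """
    n = len(data)
    if k < 1 or 4 * k > n:
        raise ValueError("need 1 <= k and 4 * k <= len(data)")
    xs = sorted(data)      # xs[i - 1] is the i-th smallest observation
    a = xs[n - k]          # X_(n-k+1, n)
    b = xs[n - 2 * k]      # X_(n-2k+1, n)
    c = xs[n - 4 * k]      # X_(n-4k+1, n)
    return math.log((a - b) / (b - c)) / math.log(2.0)

def sample_pareto(rng, alpha):
    """Inverse-transform draw from a Pareto distribution with scale 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

rng = random.Random(1)
sample = [sample_pareto(rng, alpha=2.0) for _ in range(100_000)]
xi_hat = pickands_estimator(sample, k=2000)
print(xi_hat)  # should land near the true value xi = 1 / alpha = 0.5
```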

### Hill tail-index

Let $(X_n , n \geq 1)$ be a sequence of independent and identically distributed random variables with distribution function $F \in D(H(\xi))$, the maximum domain of attraction of the generalized extreme value distribution $H$, where $\xi > 0$ (the Hill estimator applies only when the tail index is positive). If $\lim_{n\to\infty} k(n) = \infty$ and $\lim_{n\to\infty} \frac{k(n)}{n}= 0$, then the Hill tail-index estimator is[3]

$\xi^{Hill}_{(k(n),n)} = \frac{1}{k(n)} \sum_{i=n-k(n)+1}^{n} \left( \ln X_{(i,n)} - \ln X_{(n-k(n)+1,n)} \right)$

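where $X_{(i,n)}$ denotes the $i$-th order statistic of the sample, that is, the $i$-th smallest of $X_1,\ldots,X_n$, so that $X_{(n,n)}=\max \left(X_1,\ldots ,X_{n}\right)$. This estimator converges in probability to $\xi$.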
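The Hill formula is an average of log-excesses over a high order statistic, which can be sketched as follows. As before, the subscripts are read as positions in the sample sorted in increasing order, and the function names, the synthetic Pareto sample (shape 2, so $\xi = 0.5$) and the value $k = 2000$ are assumptions chosen for illustration rather than recommendations.

```python
import math
import random

def hill_estimator(data, k):
    """Hill estimate of xi from the k largest observations of `data`.

    Averages ln X_(i,n) - ln X_(n-k+1,n) over i = n-k+1, ..., n.
    """
    n = len(data)
    if k < 1 or k > n:
        raise ValueError("need 1 <= k <= len(data)")
    xs = sorted(data)            # xs[i - 1] is the i-th smallest observation
    threshold = xs[n - k]        # X_(n-k+1, n)
    if threshold <= 0:
        raise ValueError("the k largest observations must be positive")
    log_threshold = math.log(threshold)
    return sum(math.log(x) - log_threshold for x in xs[n - k:]) / k

def sample_pareto(rng, alpha):
    """Inverse-transform draw from a Pareto distribution with scale 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

rng = random.Random(2)
sample = [sample_pareto(rng, alpha=2.0) for _ in range(100_000)]
xi_hat = hill_estimator(sample, k=2000)
print(xi_hat)  # should land near the true value xi = 1 / alpha = 0.5
```

Plotting the estimate against a range of `k` values (a so-called Hill plot) is a common way to judge how sensitive the result is to this choice.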