Generalized Pareto distribution

From Wikipedia, the free encyclopedia
This article is about a particular family of continuous distributions referred to as the generalized Pareto distribution. For the hierarchy of generalized Pareto distributions, see Pareto distribution.
Generalized Pareto distribution
Parameters

\mu \in (-\infty,\infty) \, location (real)
\sigma \in (0,\infty)    \, scale (real)

\xi\in (-\infty,\infty)  \, shape (real)
Support

x \geqslant \mu\,\;(\xi \geqslant 0)

\mu \leqslant x \leqslant \mu-\sigma/\xi\,\;(\xi < 0)
pdf

\frac{1}{\sigma}(1 + \xi z )^{-(1/\xi +1)}

where z=\frac{x-\mu}{\sigma}
CDF 1-(1+\xi z)^{-1/\xi} \,
Mean \mu + \frac{\sigma}{1-\xi}\, \; (\xi < 1)
Median \mu + \frac{\sigma( 2^{\xi} -1)}{\xi}
Mode
Variance \frac{\sigma^2}{(1-\xi)^2(1-2\xi)}\, \; (\xi < 1/2)
Skewness \frac{2(1+\xi)\sqrt{1-2\xi}}{(1-3\xi)}\,\;(\xi<1/3)
Ex. kurtosis \frac{3(1-2\xi)(2\xi^2+\xi+3)}{(1-3\xi)(1-4\xi)}-3\,\;(\xi<1/4)
Entropy
MGF e^{\theta\mu}\,\sum_{j=0}^\infty \left[\frac{(\theta\sigma)^j}{\prod_{k=0}^j(1-k\xi)}\right], \;(k\xi<1)
CF e^{it\mu}\,\sum_{j=0}^\infty \left[\frac{(it\sigma)^j}{\prod_{k=0}^j(1-k\xi)}\right], \;(k\xi<1)

In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location \mu, scale \sigma, and shape \xi.[1][2] Sometimes it is specified by only scale and shape[3] and sometimes only by its shape parameter. Some references give the shape parameter as  \kappa =  - \xi \,.[4]

Definition

The standard cumulative distribution function (cdf) of the GPD is defined by[5]

F_{\xi}(z) = \begin{cases}
1 - \left(1+ \xi z\right)^{-1/\xi} & \text{for }\xi \neq 0, \\
1 - e^{-z} & \text{for }\xi = 0.
\end{cases}

where the support is z \geq 0 for \xi \geq 0 and 0 \leq z \leq -1/\xi for \xi < 0. The corresponding probability density function (pdf) is

f_{\xi}(z) = \begin{cases}
(\xi  z+1)^{-\frac{\xi +1}{\xi }} & \text{for }\xi \neq 0, \\
e^{-z} & \text{for }\xi = 0.
\end{cases}
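As a rough illustration, the two piecewise definitions above can be sketched in Python using only the standard library. The function names gpd_std_cdf and gpd_std_pdf are hypothetical, chosen for this example. Writing the power through log1p and expm1 is a numerical-stability choice made here and is not part of the definition. Treating the upper endpoint z = -1/\xi (for \xi < 0) as outside the support of the pdf is likewise an assumption of the sketch.

```python
import math


def gpd_std_cdf(z, xi):
    """Standard GPD cdf F_xi(z); name and signature are illustrative only."""
    if z <= 0.0:
        return 0.0  # at or below the lower end of the support
    if xi == 0.0:
        return -math.expm1(-z)  # 1 - e^{-z}
    base = 1.0 + xi * z
    if base <= 0.0:
        return 1.0  # xi < 0 and z >= -1/xi: at or above the upper endpoint
    # 1 - (1 + xi z)^{-1/xi}, written with log1p/expm1 for accuracy near xi = 0
    return -math.expm1(-math.log1p(xi * z) / xi)


def gpd_std_pdf(z, xi):
    """Standard GPD pdf f_xi(z); name and signature are illustrative only."""
    if z < 0.0:
        return 0.0  # below the support
    if xi == 0.0:
        return math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:
        return 0.0  # assumption: the endpoint itself is treated as outside
    # (1 + xi z)^{-(xi + 1)/xi}
    return math.exp(-(xi + 1.0) / xi * math.log1p(xi * z))
```

Evaluating gpd_std_cdf at a fixed z for a very small positive \xi and for \xi = 0 should give nearly the same value, which reflects the fact that the exponential case is the limit of the general formula.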

Differential equation

The pdf of the standard GPD is a solution of the following differential equation:

\left\{\begin{array}{l}
(\xi  z+1) f_{\xi}'(z)+(\xi +1) f_{\xi}(z)=0, \\
f_{\xi}(0)=1
\end{array}\right\}
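
One way to check this, sketched here for \xi \neq 0, is to differentiate the standard pdf f_{\xi}(z) = (\xi z+1)^{-\frac{\xi +1}{\xi }} directly:

f_{\xi}'(z) = -\frac{\xi +1}{\xi}\,\xi\,(\xi z+1)^{-\frac{\xi +1}{\xi }-1} = -\frac{(\xi +1)\, f_{\xi}(z)}{\xi z+1}.

Multiplying both sides by (\xi z+1) gives (\xi z+1) f_{\xi}'(z)+(\xi +1) f_{\xi}(z)=0, and setting z = 0 in the pdf gives f_{\xi}(0)=1. For \xi = 0 the equation reduces to f_{0}'(z)+f_{0}(z)=0, which e^{-z} satisfies.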

Characterization

The related location-scale family of distributions is obtained by replacing the argument z by \frac{x-\mu}{\sigma} and adjusting the support accordingly. The cumulative distribution function is

F_{(\xi,\mu,\sigma)}(x) = \begin{cases}
1 - \left(1+ \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} & \text{for }\xi \neq 0, \\
1 - \exp \left(-\frac{x-\mu}{\sigma}\right) & \text{for }\xi = 0.
\end{cases}

for  x \geqslant \mu when  \xi \geqslant 0 \,, and  \mu \leqslant x \leqslant \mu - \sigma /\xi when  \xi < 0, where \mu\in\mathbb R, \sigma>0, and \xi\in\mathbb R.

The probability density function (pdf) is

f_{(\xi,\mu,\sigma)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi (x-\mu)}{\sigma}\right)^{\left(-\frac{1}{\xi} - 1\right)},

or equivalently

f_{(\xi,\mu,\sigma)}(x) = \frac{\sigma^{\frac{1}{\xi}}}{\left(\sigma + \xi (x-\mu)\right)^{\frac{1}{\xi}+1}},

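again, for x \geqslant \mu when \xi \geqslant 0, and \mu \leqslant x \leqslant \mu - \sigma /\xi when \xi < 0. For \xi = 0 both expressions are understood as their limit, \frac{1}{\sigma}\exp \left(-\frac{x-\mu}{\sigma}\right), in agreement with the cdf above.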
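
The location-scale formulas above can be checked numerically. The following Python sketch (standard library only) evaluates both pdf expressions at one point and confirms that the median listed in the summary table, \mu + \sigma(2^{\xi}-1)/\xi, has cdf value close to 1/2. The function names and the parameter values are hypothetical, chosen only for illustration, and the two pdf helpers assume \xi \neq 0 and an x inside the support.

```python
import math


def gpd_cdf(x, mu, sigma, xi):
    """Location-scale GPD cdf (illustrative helper, not a library API)."""
    z = (x - mu) / sigma
    if z <= 0.0:
        return 0.0  # at or below the lower end of the support
    if xi == 0.0:
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:
        return 1.0  # xi < 0 and x at or beyond mu - sigma/xi
    return 1.0 - base ** (-1.0 / xi)


def gpd_pdf(x, mu, sigma, xi):
    """First pdf form; assumes xi != 0 and x inside the support."""
    return (1.0 / sigma) * (1.0 + xi * (x - mu) / sigma) ** (-1.0 / xi - 1.0)


def gpd_pdf_alt(x, mu, sigma, xi):
    """Second, algebraically equivalent pdf form; same assumptions."""
    return sigma ** (1.0 / xi) / (sigma + xi * (x - mu)) ** (1.0 / xi + 1.0)


mu, sigma, xi = 2.0, 1.5, 0.25  # arbitrary illustrative values
x = 4.0

# The two pdf expressions should agree up to floating-point rounding.
same = math.isclose(gpd_pdf(x, mu, sigma, xi), gpd_pdf_alt(x, mu, sigma, xi))

# Median from the summary table: mu + sigma (2^xi - 1) / xi
median = mu + sigma * (2.0 ** xi - 1.0) / xi
cdf_at_median = gpd_cdf(median, mu, sigma, xi)

print(same, cdf_at_median)
```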

The pdf is a solution of the following differential equation:

\left\{\begin{array}{l}
f'(x) (-\mu \xi +\sigma+\xi x)+(\xi+1) f(x)=0, \\
f(0)=\frac{\left(1-\frac{\mu \xi}{\sigma}\right)^{-\frac{1}{\xi }-1}}{\sigma}
\end{array}\right\}

Characteristic and moment generating functions

The characteristic function and the moment generating function (MGF) were derived by Muraleedharan and Guedes Soares, who also obtained the skewness and kurtosis from the MGF.[6]
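
As a minimal sketch, the MGF series listed in the summary table can be evaluated by a partial sum. The Python function below is hypothetical (its name, the default number of terms, and the restriction to \xi \leq 0 are assumptions of this example, not taken from the cited paper). The restriction keeps every factor 1 - k\xi at least 1, so the condition k\xi < 1 holds for all k. For \xi = 0 the series is geometric and converges only when |\theta\sigma| < 1.

```python
import math


def gpd_mgf_partial_sum(theta, mu, sigma, xi, terms=200):
    """Partial sum of e^{theta mu} * sum_j (theta sigma)^j / prod_{k=0}^{j} (1 - k xi)."""
    if xi > 0.0:
        # Assumption of this sketch: only xi <= 0, where k*xi < 1 for every k.
        raise ValueError("this sketch only handles xi <= 0")
    total = 0.0
    term = 1.0  # j = 0 term: (theta*sigma)^0 / (1 - 0*xi) = 1
    for j in range(terms):
        total += term
        # Ratio of consecutive terms: theta*sigma / (1 - (j + 1)*xi)
        term *= theta * sigma / (1.0 - (j + 1) * xi)
    return math.exp(theta * mu) * total


# For xi = 0 and mu = 0 the GPD is exponential with scale sigma,
# whose MGF is 1 / (1 - theta*sigma) for theta*sigma < 1.
approx = gpd_mgf_partial_sum(0.5, 0.0, 1.0, 0.0)
exact = 1.0 / (1.0 - 0.5)
print(approx, exact)
```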

Special cases

Generating generalized Pareto random variables

If U is uniformly distributed on (0, 1], then

 X = \mu + \frac{\sigma (U^{-\xi}-1)}{\xi} \sim \mbox{GPD}(\mu, \sigma, \xi \neq 0)

and

 X = \mu - \sigma \ln(U) \sim \mbox{GPD}(\mu,\sigma,\xi =0).

Both formulas are obtained by inversion of the cdf.
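
The inversion formulas above can be sketched in Python with the standard library. The function name sample_gpd, the seed, and the parameter values are hypothetical choices for this example. Because random() returns values in [0, 1), the sketch uses 1 - random() to obtain U in (0, 1], so that the logarithm and the negative power are always defined. The sample mean is then compared with \mu + \sigma/(1-\xi) from the summary table.

```python
import math
import random


def sample_gpd(mu, sigma, xi, rng):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf (illustrative)."""
    u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1]
    if xi == 0.0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi


rng = random.Random(12345)  # fixed seed so the run is repeatable
mu, sigma, xi = 0.0, 1.0, 0.1  # arbitrary illustrative values
draws = [sample_gpd(mu, sigma, xi, rng) for _ in range(200_000)]

sample_mean = sum(draws) / len(draws)
theoretical_mean = mu + sigma / (1.0 - xi)  # valid because xi < 1
print(sample_mean, theoretical_mean)
```

With this many draws the two printed values should be close, though the exact sample mean depends on the seed and on the random number generator.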

In the MATLAB Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.

In GNU R, the "rgpd" command from the POT or evd package generates generalized Pareto random numbers (for exact usage, see http://rss.acs.unt.edu/Rdoc/library/POT/html/simGPD.html).

Notes

  1. ^ Coles, Stuart (2001-12-12). An Introduction to Statistical Modeling of Extreme Values. Springer. p. 75. ISBN 9781852334598. 
  2. ^ Dargahi-Noubary, G. R. (1989). "On tail estimation: An improved method". Mathematical Geology 21 (8): 829–842. doi:10.1007/BF00894450.
  3. ^ Hosking, J. R. M.; Wallis, J. R. (1987). "Parameter and Quantile Estimation for the Generalized Pareto Distribution". Technometrics 29 (3): 339–349. doi:10.2307/1269343.
  4. ^ Davison, A. C. (1984-09-30). "Modelling Excesses over High Thresholds, with an Application". In de Oliveira, J. Tiago. Statistical Extremes and Applications. Kluwer. p. 462. ISBN 9789027718044. 
  5. ^ Embrechts, Paul; Klüppelberg, Claudia; Mikosch, Thomas (1997-01-01). Modelling extremal events for insurance and finance. p. 162. ISBN 9783540609315. 
  6. ^ Muraleedharan, G.; Guedes Soares, C. (2014). "Characteristic and Moment Generating Functions of Generalised Pareto (GP3) and Weibull Distributions". Journal of Scientific Research and Reports 3 (14): 1861–1874.

References

  • N. L. Johnson, S. Kotz, and N. Balakrishnan (1994). Continuous Univariate Distributions Volume 1, second edition. New York: Wiley. ISBN 0-471-58495-9.  Chapter 20, Section 12: Generalized Pareto Distributions.
  • Arnold, B. C. and Laguna, L. (1977). On generalized Pareto distributions with applications to income data. Ames, Iowa: Iowa State University, Department of Economics. 
