Wrapped distribution

In probability theory and directional statistics, a wrapped probability distribution is a continuous probability distribution that describes data points lying on a unit n-sphere. In one dimension, a wrapped distribution describes points on the unit circle.

Any probability density function (pdf) p(\phi) on the line can be "wrapped" around the circumference of a circle of unit radius.[1] That is, the pdf of the wrapped variable

\theta=\phi \mod 2\pi in some interval of length 2\pi

is


p_w(\theta)=\sum_{k=-\infty}^\infty {p(\theta+2\pi k)},

which is a periodic sum of period 2\pi. The preferred interval is generally (-\pi<\theta\le\pi), for which \ln(e^{i\theta})=i\arg(e^{i\theta})=i\theta.
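The periodic sum can be checked numerically. The following is a minimal sketch, assuming a normal density as the unwrapped p(\phi) and truncating the infinite sum at |k| <= 20; the function names, the parameter values and the midpoint-rule integrator are illustrative choices, not part of the definition.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the (unwrapped) normal distribution on the line."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def wrapped_pdf(theta, pdf, k_max=20):
    """Truncated periodic sum of pdf(theta + 2*pi*k) over |k| <= k_max."""
    return sum(pdf(theta + 2.0 * math.pi * k) for k in range(-k_max, k_max + 1))

def integrate_over_circle(g, n=2000):
    """Midpoint rule over one period, here the interval from -pi to pi."""
    h = 2.0 * math.pi / n
    return h * sum(g(-math.pi + (j + 0.5) * h) for j in range(n))

# Assumed example parameters (mu = 0.5, sigma = 1.5).
pdf = lambda x: normal_pdf(x, mu=0.5, sigma=1.5)
p_w = lambda t: wrapped_pdf(t, pdf)

total = integrate_over_circle(p_w)                       # should be close to 1
periodic_gap = abs(p_w(0.3) - p_w(0.3 + 2.0 * math.pi))  # should be close to 0
print(total, periodic_gap)
```

The wrapped density integrates to one over any interval of length 2\pi and repeats with period 2\pi, up to truncation and rounding error.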

Theory

In most situations, a process involving circular statistics produces angles (\phi) which lie in the interval from negative infinity to positive infinity, and are described by an "unwrapped" probability density function p(\phi). However, a measurement will yield a "measured" angle \theta which lies in some interval of length 2\pi (for example [0,2\pi)). In other words, a measurement cannot tell whether the "true" angle \phi or a "wrapped" angle \phi+2\pi a has been measured, where a is some unknown integer. That is:

\theta=\phi+2\pi a.

If we wish to calculate the expected value of some function of the measured angle it will be:

\langle f(\theta)\rangle=\int_{-\infty}^\infty p(\phi)f(\phi+2\pi a)d\phi.

We can express the integral as a sum of integrals over periods of 2\pi (e.g. 0 to 2\pi):

\langle f(\theta)\rangle=\sum_{k=-\infty}^\infty \int_{2\pi k}^{2\pi(k+1)} p(\phi)f(\phi+2\pi a)d\phi.

Changing the variable of integration to \theta'=\phi-2\pi k and exchanging the order of integration and summation, we have

\langle f(\theta)\rangle= \int_0^{2\pi} p_w(\theta')f(\theta'+2\pi a')d\theta'

where p_w(\theta') is the pdf of the "wrapped" distribution and a' is another unknown integer (a'=a+k). It can be seen that the unknown integer a' introduces an ambiguity into the expectation value of f(\theta). A particular instance of this problem is encountered when attempting to take the mean of a set of measured angles. If, instead of the measured angles, we introduce the parameter z=e^{i\theta} it is seen that z has an unambiguous relationship to the "true" angle \phi since:

z=e^{i\theta}=e^{i\phi}.
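The ambiguity in averaging measured angles, and its removal by working with z, can be illustrated with a short sketch. It assumes four hypothetical "true" angles clustered around zero and a measurement that reports them in [0,2\pi); taking the argument of the averaged z is one common way to turn the average back into an angle and is used here only for illustration.

```python
import cmath
import math

def circular_mean(angles):
    """Argument of the average of z = exp(i*theta); cmath.phase returns a value in [-pi, pi]."""
    z_mean = sum(cmath.exp(1j * t) for t in angles) / len(angles)
    return cmath.phase(z_mean)

# Hypothetical "true" angles clustered around 0.
true_angles = [-0.2, -0.1, 0.1, 0.2]
# The same angles as a measurement would report them in [0, 2*pi).
measured = [t % (2.0 * math.pi) for t in true_angles]

naive = sum(measured) / len(measured)    # arithmetic mean of measured angles
circ_measured = circular_mean(measured)  # uses z, unaffected by the wrapping
circ_true = circular_mean(true_angles)

print(naive, circ_measured, circ_true)
```

The arithmetic mean of the measured angles lands near \pi, on the opposite side of the circle from the data, while the value computed from z is the same for the measured and the "true" angles.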

Calculating the expectation value of a function of z will yield unambiguous answers:

\langle f(z)\rangle= \int_0^{2\pi} p_w(\theta')f(e^{i\theta'})d\theta'

and it is for this reason that the z parameter is the preferred statistical variable to use in circular statistical analysis rather than the measured angles \theta. This suggests, and it is shown below, that the wrapped distribution function may itself be expressed as a function of z so that:

\langle f(z)\rangle= \oint p_w(z)f(z)\,dz

where p_w(z) is defined such that p_w(\theta)\,d\theta=p_w(z)\,dz. This concept can be extended to the multivariate context by replacing the single sum with F sums, one for each dimension of the feature space:


p_w(\vec\theta)=\sum_{k_1,...,k_F=-\infty}^{\infty}{p(\vec\theta+2\pi k_1\mathbf{e}_1+\dots+2\pi k_F\mathbf{e}_F)}

where \mathbf{e}_k=(0,\dots,0,1,0,\dots,0)^{\mathsf{T}} is the kth Euclidean basis vector.
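As a sketch of the multivariate sum with F = 2, the example below wraps a bivariate density with a double sum over k_1 and k_2. It assumes, purely for simplicity, independent zero-mean normal components with standard deviations 1 and 2; under that assumption the wrapped joint density should factor into the product of the two one-dimensional wrapped densities, which gives a convenient check.

```python
import math

TWO_PI = 2.0 * math.pi

def normal_pdf(x, sigma):
    """Zero-mean normal density on the line."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(TWO_PI))

def joint_pdf(x, y):
    """Unwrapped bivariate density; independent components are an assumption of this sketch."""
    return normal_pdf(x, 1.0) * normal_pdf(y, 2.0)

def wrapped_joint_pdf(t1, t2, k_max=8):
    """Truncated double sum over shifts of 2*pi*k1 and 2*pi*k2 along each axis."""
    ks = range(-k_max, k_max + 1)
    return sum(joint_pdf(t1 + TWO_PI * k1, t2 + TWO_PI * k2) for k1 in ks for k2 in ks)

def wrapped_1d(t, sigma, k_max=8):
    """Truncated one-dimensional periodic sum, used for comparison."""
    return sum(normal_pdf(t + TWO_PI * k, sigma) for k in range(-k_max, k_max + 1))

gap = abs(wrapped_joint_pdf(0.4, -1.1) - wrapped_1d(0.4, 1.0) * wrapped_1d(-1.1, 2.0))
print(gap)
```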

Expression in terms of characteristic functions

A fundamental wrapped distribution is the Dirac comb, which is a wrapped delta function:

\Delta_{2\pi}(\theta)=\sum_{k=-\infty}^{\infty}{\delta(\theta+2\pi k)}.

Using the delta function, a general wrapped distribution can be written

p_w(\theta)=\sum_{k= -\infty}^{\infty}\int_{-\infty}^\infty p(\theta')\delta(\theta-\theta'+2\pi k)\,d\theta'.

Exchanging the order of summation and integration, any wrapped distribution can be written as the convolution of the "unwrapped" distribution and a Dirac comb:

p_w(\theta)=\int_{-\infty}^\infty p(\theta')\Delta_{2\pi}(\theta-\theta')\,d\theta'.

The Dirac comb may also be expressed as a sum of exponentials, so we may write:

p_w(\theta)=\frac{1}{2\pi}\,\int_{-\infty}^\infty p(\theta')\sum_{n=-\infty}^{\infty}e^{in(\theta-\theta')}\,d\theta'

Again exchanging the order of summation and integration,

p_w(\theta)=\frac{1}{2\pi}\,\sum_{n=-\infty}^{\infty}\int_{-\infty}^\infty p(\theta')e^{in(\theta-\theta')}\,d\theta'.

Here \phi(s)=\int_{-\infty}^\infty p(\theta)e^{is\theta}\,d\theta denotes the characteristic function of p(\theta) (not to be confused with the unwrapped angle \phi used above). Using this definition yields a Laurent series about zero for the wrapped distribution in terms of the characteristic function of the unwrapped distribution:[2]

p_w(\theta)=\frac{1}{2\pi}\,\sum_{n=-\infty}^{\infty} \phi(-n)\,e^{in\theta} = \frac{1}{2\pi}\,\sum_{n=-\infty}^{\infty} \phi(-n)\,z^n

or

p_w(z)=\frac{1}{2\pi i}\,\sum_{n=-\infty}^{\infty} \phi(-n)\,z^{n-1}.

By analogy with linear distributions, the \phi(m) are referred to as the characteristic function of the wrapped distribution[2] (or perhaps more accurately, the characteristic sequence). This is an instance of the Poisson summation formula: the Fourier coefficients of the Fourier series for the wrapped distribution are just the values of the Fourier transform of the unwrapped distribution at integer arguments.
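The equivalence of the two representations can be checked numerically. The sketch below assumes a normal unwrapped distribution, whose characteristic function is \phi(s)=e^{i\mu s-\sigma^2 s^2/2}, and compares the truncated periodic sum with the truncated series \frac{1}{2\pi}\sum_n \phi(-n)e^{in\theta}. The parameter values, truncation limits and function names are illustrative assumptions.

```python
import cmath
import math

MU, SIGMA = 0.5, 1.5  # assumed example parameters

def normal_pdf(x):
    """Unwrapped normal density on the line."""
    return math.exp(-0.5 * ((x - MU) / SIGMA) ** 2) / (SIGMA * math.sqrt(2.0 * math.pi))

def normal_cf(s):
    """Characteristic function of the normal distribution, E[exp(i*s*X)]."""
    return cmath.exp(1j * MU * s - 0.5 * (SIGMA * s) ** 2)

def wrapped_by_sum(theta, k_max=20):
    """Wrapped density as a truncated periodic sum of the unwrapped density."""
    return sum(normal_pdf(theta + 2.0 * math.pi * k) for k in range(-k_max, k_max + 1))

def wrapped_by_series(theta, n_max=20):
    """Wrapped density as a truncated series in phi(-n) * exp(i*n*theta)."""
    total = sum(normal_cf(-n) * cmath.exp(1j * n * theta) for n in range(-n_max, n_max + 1))
    # The imaginary parts cancel in conjugate pairs up to rounding.
    return (total / (2.0 * math.pi)).real

thetas = [-3.0, -1.0, 0.0, 0.7, 2.5]
max_gap = max(abs(wrapped_by_sum(t) - wrapped_by_series(t)) for t in thetas)
print(max_gap)
```

For a broad unwrapped density the series in \phi needs few terms while the periodic sum needs many, and the reverse holds for a narrow density, so in practice one representation is often much cheaper than the other.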

Moments

The moments of the wrapped distribution p_w(z) are defined as:


\langle z^m \rangle = \oint p_w(z)z^m \, dz.

Expressing p_w(z) in terms of the characteristic function and exchanging the order of integration and summation yields:


\langle z^m \rangle = \frac{1}{2\pi i}\sum_{n=-\infty}^\infty \phi(-n)\oint z^{m+n-1}\,dz.

From the theory of residues we have


\oint z^{m+n-1}\,dz = 2\pi i \delta_{m+n}

where \delta_k is the Kronecker delta function. It follows that the moments are simply equal to the characteristic function of the unwrapped distribution for integer arguments:


\langle z^m \rangle = \phi(m).
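As a numerical illustration of this result, the sketch below evaluates \langle z^m\rangle=\int p_w(\theta)e^{im\theta}\,d\theta by a midpoint rule and compares it with \phi(m). It assumes a normal unwrapped distribution with characteristic function \phi(s)=e^{i\mu s-\sigma^2 s^2/2}; the parameters, grid size and truncation limit are arbitrary choices for the example.

```python
import cmath
import math

MU, SIGMA = 0.5, 1.5  # assumed example parameters

def normal_cf(s):
    """Characteristic function of the unwrapped normal distribution."""
    return cmath.exp(1j * MU * s - 0.5 * (SIGMA * s) ** 2)

def wrapped_pdf(theta, k_max=10):
    """Truncated periodic sum of the normal density."""
    norm = SIGMA * math.sqrt(2.0 * math.pi)
    return sum(
        math.exp(-0.5 * ((theta + 2.0 * math.pi * k - MU) / SIGMA) ** 2)
        for k in range(-k_max, k_max + 1)
    ) / norm

def circular_moment(m, n=400):
    """Midpoint-rule estimate of the integral of p_w(theta) * exp(i*m*theta) over one period."""
    h = 2.0 * math.pi / n
    total = 0j
    for j in range(n):
        theta = -math.pi + (j + 0.5) * h
        total += wrapped_pdf(theta) * cmath.exp(1j * m * theta)
    return h * total

# Compare the numerically integrated moments with phi(m) for a few integers m.
gaps = [abs(circular_moment(m) - normal_cf(m)) for m in range(-3, 4)]
print(max(gaps))
```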

Entropy

The information entropy of a circular distribution with probability density f_w(\theta) is defined as:[1]

H = -\int_\Gamma f_w(\theta)\,\ln(f_w(\theta))\,d\theta

where \Gamma is any interval of length 2\pi. If both the probability density and its logarithm can be expressed as a Fourier series (or more generally, any integral transform on the circle) then the orthogonality property may be used to obtain a series representation for the entropy which may reduce to a closed form.

The moments \phi_n=\phi(n) of the distribution are the Fourier coefficients for the Fourier series expansion of the probability density:

f_w(\theta)=\frac{1}{2\pi}\sum_{n=-\infty}^\infty \phi_n e^{-in\theta}

If the logarithm of the probability density can also be expressed as a Fourier series:

\ln(f_w(\theta))=\sum_{m=-\infty}^\infty c_m e^{im\theta}

where

c_m=\frac{1}{2\pi}\int_\Gamma \ln(f_w(\theta))e^{-i m \theta}\,d\theta

Then, exchanging the order of integration and summation, the entropy may be written as:

H=-\frac{1}{2\pi}\sum_{m=-\infty}^\infty\sum_{n=-\infty}^\infty c_m \phi_n \int_\Gamma e^{i(m-n)\theta}\,d\theta

Using the orthogonality of the Fourier basis, the integral may be reduced to:

H=-\sum_{n=-\infty}^\infty c_n \phi_n

For the particular case when the probability density is symmetric about \theta=0 (that is, symmetric about its mean, with the mean taken as the origin), c_{-m}=c_m and the logarithm may be written:

\ln(f_w(\theta))= c_0 + 2\sum_{m=1}^\infty c_m \cos(m\theta)

and

c_m=\frac{1}{2\pi}\int_\Gamma \ln(f_w(\theta))\cos(m\theta)\,d\theta

and, since normalization requires that \phi_0=1, the entropy may be written:

H=-c_0-2\sum_{n=1}^\infty c_n \phi_n
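The series form of the entropy can be compared with direct integration. The sketch below assumes a zero-mean wrapped normal density, so that the symmetric case applies and \phi_n=e^{-\sigma^2 n^2/2}; the coefficients c_m are obtained by numerical quadrature rather than in closed form, and the series is truncated at n = 10. All names, grid sizes and truncation limits are illustrative assumptions.

```python
import math

SIGMA = 1.0  # assumed example value; the mean is 0, so the density is symmetric about theta = 0

def wrapped_normal_pdf(theta, k_max=10):
    """Truncated periodic sum of the zero-mean normal density."""
    norm = SIGMA * math.sqrt(2.0 * math.pi)
    return sum(
        math.exp(-0.5 * ((theta + 2.0 * math.pi * k) / SIGMA) ** 2)
        for k in range(-k_max, k_max + 1)
    ) / norm

def phi(n):
    """Moments of the wrapped density: characteristic function of the normal at integer n."""
    return math.exp(-0.5 * (SIGMA * n) ** 2)

# Midpoint grid over one period, with the density and its logarithm tabulated once.
N_GRID = 400
STEP = 2.0 * math.pi / N_GRID
GRID = [-math.pi + (j + 0.5) * STEP for j in range(N_GRID)]
F = [wrapped_normal_pdf(t) for t in GRID]
LOG_F = [math.log(v) for v in F]

def c(m):
    """Cosine coefficient of ln f_w: (1/(2*pi)) * integral of ln f_w(theta) * cos(m*theta)."""
    return STEP / (2.0 * math.pi) * sum(lf * math.cos(m * t) for lf, t in zip(LOG_F, GRID))

entropy_direct = -STEP * sum(f * lf for f, lf in zip(F, LOG_F))
entropy_series = -c(0) - 2.0 * sum(c(n) * phi(n) for n in range(1, 11))
print(entropy_direct, entropy_series)
```

Because \phi_n decays rapidly for this choice of \sigma, a handful of terms is enough for the two values to agree to within rounding error.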

References

  1. Mardia, Kantilal; Jupp, Peter E. (1999). Directional Statistics. Wiley. ISBN 978-0-471-95333-3. Retrieved 2011-07-19.
  2. Mardia, K. (1972). Statistics of Directional Data. New York: Academic Press.
