Uniform distribution (discrete)

From Wikipedia, the free encyclopedia

discrete uniform
[Figure: probability mass function of the discrete uniform distribution for n = 5, where n = b − a + 1]
[Figure: cumulative distribution function of the discrete uniform distribution for n = 5]
parameters: a \in \{\dots,-2,-1,0,1,2,\dots\}\,
b \in \{\dots,-2,-1,0,1,2,\dots\},\ b \ge a\,
n=b-a+1\,
support: k \in \{a,a+1,\dots,b-1,b\}\,
pmf: 
    \begin{cases}
    \frac{1}{n} & \mbox{for } a \le k \le b \\
    0 & \mbox{otherwise}
    \end{cases}
cdf: 
    \begin{cases}
    0 & \mbox{for } k < a \\
    \frac{\lfloor k \rfloor - a + 1}{n} & \mbox{for } a \le k \le b \\
    1 & \mbox{for } k > b
    \end{cases}
mean: \frac{a+b}{2}\,
median: \frac{a+b}{2}\,
mode: N/A
variance: \frac{(b-a+1)^2-1}{12}=\frac{n^2-1}{12}\,
skewness: 0\,
excess kurtosis: -\frac{6(n^2+1)}{5(n^2-1)}\,
entropy: \ln(n)\,
mgf: \frac{e^{at}-e^{(b+1)t}}{n(1-e^t)}\,
cf: \frac{e^{iat}-e^{i(b+1)t}}{n(1-e^{it})}
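
The closed forms listed above can be checked against moments computed directly from the pmf. The following is a minimal sketch in Python, not a reference implementation; the function name `discrete_uniform_summary` is hypothetical, and it assumes n ≥ 2 so that the variance is nonzero.

```python
from fractions import Fraction
from math import log

def discrete_uniform_summary(a, b):
    """Moments of the uniform distribution on the integers a..b (assumes b > a)."""
    n = b - a + 1
    support = range(a, b + 1)
    p = Fraction(1, n)  # pmf: 1/n at every support point

    mean = sum(p * k for k in support)

    def central(r):
        # r-th central moment, computed exactly with rational arithmetic
        return sum(p * (k - mean) ** r for k in support)

    variance = central(2)
    excess_kurtosis = central(4) / variance ** 2 - 3
    entropy = -sum(float(p) * log(float(p)) for _ in support)  # in nats
    return mean, variance, excess_kurtosis, entropy

# A fair die: a = 1, b = 6, so n = 6.
mean, variance, excess_kurtosis, entropy = discrete_uniform_summary(1, 6)
```

For the die, the computed mean is 7/2 and the variance is 35/12, in agreement with (a+b)/2 and (n^2-1)/12.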

In probability theory and statistics, the discrete uniform distribution is a discrete probability distribution in which each of a finite set of possible values is equally probable.

If a random variable can take any of n possible values k_1,k_2,\dots,k_n, each equally probable, then it has a discrete uniform distribution. The probability of any outcome k_i is 1/n. A simple example is the throw of a fair die: the possible values of k are 1, 2, 3, 4, 5, 6, and each time the die is thrown the probability of a given score is 1/6. If two dice are thrown and their values added, the total is no longer uniformly distributed, since the totals from 2 to 12 have differing probabilities.
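
The contrast between one die and the sum of two dice can be made concrete by enumerating outcomes. This is an illustrative sketch only; the variable names are chosen here for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# One fair die: every face has probability 1/6 (discrete uniform).
one_die = {k: Fraction(1, 6) for k in faces}

# Sum of two fair dice: count how many of the 36 equally likely
# ordered pairs produce each total.
counts = Counter(x + y for x, y in product(faces, repeat=2))
two_dice = {total: Fraction(c, 36) for total, c in sorted(counts.items())}

print(two_dice[2], two_dice[7], two_dice[12])  # 1/36 1/6 1/36
```

A total of 7 is six times as likely as a total of 2 or 12, so the sum is not uniform even though each individual die is.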

If the values of a random variable with a discrete uniform distribution are real, the cumulative distribution function can be expressed in terms of the degenerate distribution; thus

F(k;a,b,n)={1\over n}\sum_{i=1}^n H(k-k_i)

where the Heaviside step function H(x − x_0) is the CDF of the degenerate distribution centered at x_0. This assumes that consistent conventions are used at the transition points.
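
The step-function form of the CDF can be sketched as follows. The convention H(0) = 1 is an assumption made here so that the result is right-continuous, which is one consistent choice at the transition points; the function names are hypothetical.

```python
def heaviside(x):
    """Step function with the convention H(0) = 1 (an assumption of this sketch)."""
    return 1.0 if x >= 0 else 0.0

def uniform_cdf(k, values):
    """F(k) = (1/n) * sum_i H(k - k_i) for equally probable real values k_i."""
    n = len(values)
    return sum(heaviside(k - k_i) for k_i in values) / n

die = [1, 2, 3, 4, 5, 6]
print(uniform_cdf(3.5, die))  # 0.5
```

With integer values a, a+1, ..., b this agrees with the floor-based form (⌊k⌋ − a + 1)/n for a ≤ k ≤ b.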

Estimation of maximum

Suppose a sample of k observations is obtained from a uniform distribution on the integers 1,2,\dots,N, and the problem is to estimate the unknown maximum N. This problem is commonly known as the German tank problem, following the application of maximum estimation to estimates of German tank production during World War II.

The UMVU estimator for the maximum is given by

\hat{N}=\frac{k+1}{k} m - 1 = m + \frac{m}{k} - 1

where m is the sample maximum, k is the sample size, and sampling is without replacement.[1][2] This can be seen as a very simple case of maximum spacing estimation.

The formula may be understood intuitively as:

"The sample maximum plus the average gap between observations in the sample",

the gap being added to compensate for the negative bias of the sample maximum as an estimator for the population maximum.[notes 1]
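
The estimator above is straightforward to compute. The sketch below assumes the observations are distinct serial numbers drawn without replacement from 1..N; the function name `umvu_max` and the sample values are hypothetical.

```python
def umvu_max(sample):
    """Estimate N as m + m/k - 1, where m is the sample maximum and k the sample size."""
    k = len(sample)
    m = max(sample)
    return m + m / k - 1

# Hypothetical observed serial numbers.
serials = [19, 40, 42, 60]
print(umvu_max(serials))  # 74.0
```

Here m = 60 and k = 4, so the estimate is 60 + 60/4 − 1 = 74: the sample maximum plus a correction of roughly one average gap.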

The estimator has variance[1]

\frac{1}{k}\frac{(N-k)(N+1)}{(k+2)} \approx \frac{N^2}{k^2} \text{ for small samples } k \ll N

so the standard deviation is approximately N/k, the (population) average size of a gap between samples; compare the term \frac{m}{k} above.
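
For small N and k, unbiasedness and the variance formula can be verified exactly by enumerating every possible sample. This is a brute-force check under the assumption that each k-subset of 1..N is equally likely; the function name `estimator_moments` is hypothetical, and the approach is only practical for small populations.

```python
from fractions import Fraction
from itertools import combinations

def estimator_moments(N, k):
    """Exact mean and variance of (k+1)/k * m - 1 over all k-subsets of 1..N."""
    estimates = [Fraction(k + 1, k) * max(s) - 1
                 for s in combinations(range(1, N + 1), k)]
    mean = sum(estimates) / len(estimates)
    var = sum((e - mean) ** 2 for e in estimates) / len(estimates)
    return mean, var

N, k = 10, 3
mean, var = estimator_moments(N, k)
print(mean, var)  # 10 77/15
```

The mean equals N, and the variance equals (N−k)(N+1)/(k(k+2)) = 7·11/(3·5) = 77/15.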

The sample maximum is the maximum likelihood estimator for the population maximum, but, as discussed above, it is biased.
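
Both claims can be illustrated numerically. Under sampling without replacement, every k-subset of 1..N is equally likely, so the likelihood of an observed sample is 1/C(N,k) for N ≥ m and zero otherwise; it therefore decreases in N and peaks at N = m. The sketch below uses hypothetical function names and sample values, and requires Python 3.8 or later for `math.comb`.

```python
from fractions import Fraction
from math import comb

def likelihood(N, sample):
    """Probability of drawing this set of serials from 1..N without replacement."""
    k, m = len(sample), max(sample)
    return Fraction(1, comb(N, k)) if N >= m else Fraction(0)

def expected_sample_max(N, k):
    """E[m] = sum over m of m * C(m-1, k-1) / C(N, k)."""
    return sum(Fraction(m * comb(m - 1, k - 1), comb(N, k))
               for m in range(k, N + 1))

serials = [19, 40, 42, 60]  # hypothetical observations
mle = max(range(1, 200), key=lambda N: likelihood(N, serials))
print(mle)                         # 60
print(expected_sample_max(10, 3))  # 33/4
```

With N = 10 and k = 3 the expected sample maximum is 33/4 = 8.25, which is below N, showing the downward bias.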

If samples are not numbered but are recognizable or markable, one can instead estimate population size via the capture-recapture method.

Random permutation

See rencontres numbers for an account of the probability distribution of the number of fixed points of a uniformly distributed random permutation.
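
As a small illustration, the number of fixed points can be tabulated by enumerating all n! equally likely permutations. This brute-force sketch is only feasible for small n; the function name `fixed_point_counts` is hypothetical.

```python
from collections import Counter
from itertools import permutations

def fixed_point_counts(n):
    """Entry j is the number of permutations of n items with exactly j fixed points."""
    counts = Counter(sum(1 for i, p in enumerate(perm) if i == p)
                     for perm in permutations(range(n)))
    return [counts[j] for j in range(n + 1)]

print(fixed_point_counts(4))  # [9, 8, 6, 0, 1]
```

Dividing each count by n! gives the probability of that number of fixed points; for n = 4, the probability of no fixed points is 9/24.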

Notes

  1. The sample maximum is never more than the population maximum, but can be less; hence it is a biased estimator that tends to underestimate the population maximum.

References

  1. Johnson, Roger (1994), "Estimating the Size of a Population", Teaching Statistics 16 (2, Summer), doi:10.1111/j.1467-9639.1994.tb00688.x
  2. Johnson, Roger (2006), "Estimating the Size of a Population", Getting the Best from Teaching Statistics, http://www.rsscse.org.uk/ts/gtb/johnson.pdf