# Van Houtum distribution

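| Property | Value |
|---|---|
| Parameters | $p_a,p_b \in [0,1] \text{ and } a,b \in \mathbb{Z} \text{ with } a\leq b$ |
| Support | $u \in \{a,a+1,\dots,b-1,b\}$ |
| Probability mass function | $\begin{cases} p_a & \text{if } u=a; \\ p_b & \text{if } u=b \\ \frac{1-p_a-p_b}{b-a-1} & \text{if } a<u<b \\ 0 & \text{otherwise} \end{cases}$ |
| Cumulative distribution function | $\begin{cases} 0 & \textrm{if } u<a \\ p_a & \textrm{if } u=a \\ p_a+\lfloor u-a \rfloor \frac{1-p_a-p_b}{b-a-1} & \textrm{if } a<u<b \\ 1 & \textrm{if } u\geq b \end{cases}$ |
| Mean | $ap_a+bp_b+(1-p_a-p_b)\frac{a+b}{2}$ |
| Mode | N/A |
| Variance | $a^2p_a+b^2p_b+\frac{1-p_a-p_b}{b-a-1}\cdot\frac{b(2b-1)(b-1)-a(2a+1)(a+1)}{6}-\left(\frac{(a+b)(1-p_a-p_b)+2ap_a+2bp_b}{2}\right)^2$ |
| Entropy | $-p_a \ln(p_a)-p_b\ln(p_b)-(1-p_a-p_b)\ln\left(\frac{1-p_a-p_b}{b-a-1}\right)$ |
| Moment-generating function | $e^{ta}p_a+e^{tb}p_b+\frac{1-p_a-p_b}{b-a-1}\frac{e^{bt}-e^{(a+1)t}}{e^t-1}$ |
| Characteristic function | $e^{ita}p_a+e^{itb}p_b+\frac{1-p_a-p_b}{b-a-1}\frac{e^{bit}-e^{(a+1)it}}{e^{it}-1}$ |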

In probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after Prof. Geert-Jan van Houtum.[1] It is characterized by assigning equal probability to every value in a finite set of possible values, except possibly the smallest and largest elements of this set. Because it is uniform except possibly at its boundaries, the Van Houtum distribution generalizes the discrete uniform distribution and is sometimes also referred to as quasi-uniform.

Often the only available information about a discrete random variable is its first two moments. The Van Houtum distribution can be used to fit a distribution with finite support to these moments.

A simple example of the Van Houtum distribution arises when throwing a loaded die that has been tampered with so that it lands on a 6 twice as often as on a 1. The possible outcomes are 1, 2, 3, 4, 5 and 6. Each time the die is thrown, the probability of throwing a 2, 3, 4 or 5 is 1/6 each; the probability of throwing a 1 is 1/9 and the probability of throwing a 6 is 2/9.

## Probability mass function

A random variable $U$ has a Van Houtum $(a, b, p_a, p_b)$ distribution if its probability mass function is

$\Pr(U=u) = \begin{cases} p_a & \text{if } u=a; \\[8pt] p_b & \text{if } u=b \\[8pt] \dfrac{1-p_a-p_b}{b-a-1} & \text{if } a<u<b \\[8pt] 0 & \text{otherwise} \end{cases}$
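As an illustration, the probability mass function can be sketched in a few lines of Python. The function name `van_houtum_pmf` is hypothetical, and the sketch assumes integer arguments and $b - a \geq 2$ so that at least one interior point exists. It is evaluated on the loaded die from the introduction using exact fractions.

```python
from fractions import Fraction


def van_houtum_pmf(u, a, b, p_a, p_b):
    """Pr(U = u) for a Van Houtum(a, b, p_a, p_b) distribution.

    Illustrative helper: assumes u, a, b are integers and b - a >= 2.
    """
    if u == a:
        return p_a
    if u == b:
        return p_b
    if a < u < b:
        # The remaining mass is spread evenly over the b - a - 1 interior points.
        return (1 - p_a - p_b) / (b - a - 1)
    return 0


# Loaded die: a = 1, b = 6, p_a = 1/9, p_b = 2/9.
a, b = 1, 6
p_a, p_b = Fraction(1, 9), Fraction(2, 9)
pmf = {u: van_houtum_pmf(u, a, b, p_a, p_b) for u in range(a, b + 1)}

total = sum(pmf.values())                   # should equal 1
mean = sum(u * p for u, p in pmf.items())   # a*p_a + b*p_b + (1-p_a-p_b)*(a+b)/2
print(pmf[3], total, mean)
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to confirm that the probabilities sum to one and that the mean agrees with the closed-form expression.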

## Fitting procedure

Suppose a random variable $X$ has mean $\mu$ and squared coefficient of variation $c^2$. Let $U$ be a Van Houtum distributed random variable. Then the first two moments of $U$ match the first two moments of $X$ if $a$, $b$, $p_a$ and $p_b$ are chosen such that:[2]

\begin{align} a &= \left\lceil \mu - \frac{1}{2} \left\lceil \sqrt{1+12c^2\mu^2} \right\rceil \right\rceil \\[8pt] b &= \left\lfloor \mu + \frac{1}{2} \left\lceil \sqrt{1+12c^2\mu^2} \right\rceil \right\rfloor \\[8pt] p_b &= \frac{(c^2+1)\mu^2-A-(a^2-A)(2\mu-a-b)/(a-b)}{a^2+b^2-2A} \\[8pt] p_a &= \frac{2\mu-a-b}{a-b}+p_b \\[12pt] \text{where } A & = \frac{2a^2+a+2ab-b+2b^2}{6}. \end{align}
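The fitting formulas translate directly into code. The following is a minimal sketch, not a reference implementation: the function name `fit_van_houtum` is hypothetical, it uses floating-point arithmetic, and it does not handle degenerate inputs (for example zero variance with an integer mean, where $a = b$) or inputs for which no fit exists.

```python
import math


def fit_van_houtum(mu, c2):
    """Return (a, b, p_a, p_b) matching mean mu and squared coefficient
    of variation c2, following the formulas above.

    Illustrative sketch: no validation of the resulting probabilities.
    """
    half_width = 0.5 * math.ceil(math.sqrt(1 + 12 * c2 * mu**2))
    a = math.ceil(mu - half_width)
    b = math.floor(mu + half_width)

    A = (2 * a * a + a + 2 * a * b - b + 2 * b * b) / 6
    d = (2 * mu - a - b) / (a - b)  # equals p_a - p_b
    p_b = ((c2 + 1) * mu**2 - A - (a * a - A) * d) / (a * a + b * b - 2 * A)
    p_a = d + p_b
    return a, b, p_a, p_b


# Moments of the loaded die from the introduction:
# mean 34/9 and variance 230/81, so c^2 = variance / mean^2.
mu = 34 / 9
c2 = (230 / 81) / mu**2
a, b, p_a, p_b = fit_van_houtum(mu, c2)
print(a, b, p_a, p_b)
```

For these inputs the procedure should recover the support $\{1,\dots,6\}$ with $p_a \approx 1/9$ and $p_b \approx 2/9$, up to floating-point rounding. In practice the returned $p_a$ and $p_b$ should be checked to lie in $[0,1]$ before use.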

There does not exist a Van Houtum distribution for every combination of $\mu$ and $c^2$. By using the fact that for any real mean $\mu$ the discrete distribution on the integers that has minimal variance is concentrated on the integers $\lfloor \mu \rfloor$ and $\lceil \mu \rceil$, it is easy to verify that a Van Houtum distribution (or indeed any discrete distribution on the integers) can only be fitted on the first two moments if [3]

$c^2\mu^2 \geq (\mu-\lfloor \mu \rfloor)(\lceil \mu \rceil-\mu)^2+(\mu-\lfloor \mu \rfloor)^2(\lceil \mu \rceil-\mu).$
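The feasibility condition can be checked numerically by computing the variance of the two-point distribution on $\lfloor \mu \rfloor$ and $\lceil \mu \rceil$ directly. The sketch below does this; the helper names `min_integer_variance` and `can_fit` are hypothetical, and floating-point comparison is used without a tolerance.

```python
import math


def min_integer_variance(mu):
    """Variance of the two-point distribution on floor(mu) and ceil(mu)
    whose mean is mu (the minimal variance for an integer-valued variable)."""
    lo, hi = math.floor(mu), math.ceil(mu)
    if lo == hi:
        return 0.0  # integer mean: all mass can sit on mu itself
    p_hi = mu - lo   # weight on ceil(mu) that makes the mean equal mu
    p_lo = 1 - p_hi  # weight on floor(mu)
    return p_lo * (lo - mu) ** 2 + p_hi * (hi - mu) ** 2


def can_fit(mu, c2):
    """True if the variance c2 * mu^2 is at least the minimal variance."""
    return c2 * mu**2 >= min_integer_variance(mu)


print(min_integer_variance(2.5), can_fit(2.5, 0.01), can_fit(2.5, 0.05))
```

With $\mu = 2.5$ the minimal variance is $0.25$, so $c^2 = 0.01$ (variance $0.0625$) cannot be fitted while $c^2 = 0.05$ (variance $0.3125$) passes the check.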