Rectified Gaussian distribution

In probability theory, the rectified Gaussian distribution is a modification of the Gaussian distribution when its negative elements are reset to 0 (analogous to an electronic rectifier). It is essentially a mixture of a discrete distribution (constant 0) and a continuous distribution (a truncated Gaussian distribution with interval $(0,\infty )$ ) as a result of censoring.

Density function

The probability density function of a rectified Gaussian distribution, written $X\sim {\mathcal {N}}^{\textrm {R}}(\mu ,\sigma ^{2})$ for a random variable $X$ derived from the normal distribution ${\mathcal {N}}(\mu ,\sigma ^{2})$, is given by

$f(x;\mu ,\sigma ^{2})=\Phi \left(-{\frac {\mu }{\sigma }}\right)\delta (x)+{\frac {1}{\sqrt {2\pi \sigma ^{2}}}}\;e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}{\textrm {U}}(x).$

Figure: a comparison of the Gaussian, rectified Gaussian, and truncated Gaussian distributions.
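As a sanity check on the density, the point mass $\Phi (-\mu /\sigma )$ at zero and the integral of the continuous part should sum to one. A minimal sketch in Python (standard library only; the parameter values $\mu =1$, $\sigma =2$ are arbitrary illustrations):

```python
import math

# Arbitrary illustrative parameters.
mu, sigma = 1.0, 2.0

# Point mass at zero: Phi(-mu/sigma), computed via the error function.
point_mass = 0.5 * (1.0 + math.erf((-mu / sigma) / math.sqrt(2.0)))

# Continuous part of the density for x > 0 (the Gaussian kernel).
def pdf_cont(x):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

# Trapezoidal integration over (0, mu + 10*sigma); the tail beyond is negligible.
n = 200_000
hi = mu + 10 * sigma
h = hi / n
cont_mass = h * (0.5 * pdf_cont(0.0)
                 + sum(pdf_cont(i * h) for i in range(1, n))
                 + 0.5 * pdf_cont(hi))

total = point_mass + cont_mass  # should be very close to 1
```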

Here, $\Phi (x)$ is the cumulative distribution function (cdf) of the standard normal distribution:

$\Phi (x)={\frac {1}{\sqrt {2\pi }}}\int _{-\infty }^{x}e^{-t^{2}/2}\,dt,\quad x\in \mathbb {R} ;$

$\delta (x)$ is the Dirac delta function

$\delta (x)={\begin{cases}+\infty ,&x=0\\0,&x\neq 0;\end{cases}}$

and ${\textrm {U}}(x)$ is the unit step function:

${\textrm {U}}(x)={\begin{cases}0,&x\leq 0,\\1,&x>0.\end{cases}}$

Mean and variance

Since the unrectified normal distribution has mean $\mu$, and since rectification shifts some probability mass to a higher value (from negative values to 0), the mean of the rectified distribution is greater than $\mu .$ Because rectification moves some of the probability mass toward the rest of the mass, it acts as a mean-preserving contraction combined with a mean-changing rigid shift, and the variance is therefore decreased: the variance of the rectified distribution is less than $\sigma ^{2}.$

Generating values
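Both moment claims above are easy to verify empirically by drawing $s\sim {\mathcal {N}}(\mu ,\sigma ^{2})$ and clipping at zero. A minimal Python sketch (standard library only; $\mu =1$ and $\sigma =2$ are arbitrary illustrative values):

```python
import random
import statistics

random.seed(0)
mu, sigma = 1.0, 2.0  # arbitrary illustrative parameters

# Rectified-Gaussian samples: draw s ~ N(mu, sigma^2), then clip at zero.
samples = [max(0.0, random.gauss(mu, sigma)) for _ in range(200_000)]

mean_r = statistics.fmean(samples)
var_r = statistics.pvariance(samples)

# Empirically: mean_r exceeds mu, and var_r falls below sigma**2,
# matching the mean/variance argument above.
```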

To generate values computationally, one can use

$s\sim {\mathcal {N}}(\mu ,\sigma ^{2}),\quad x={\textrm {max}}(0,s),$ and then

$x\sim {\mathcal {N}}^{\textrm {R}}(\mu ,\sigma ^{2}).$

Application

A rectified Gaussian distribution is semi-conjugate to the Gaussian likelihood, and it has recently been applied to factor analysis, in particular (non-negative) rectified factor analysis. Harva proposed a variational learning algorithm for the rectified factor model, in which the factors follow a mixture of rectified Gaussian distributions; Meng later proposed an infinite rectified factor model coupled with a Gibbs sampling solution, in which the factors follow a Dirichlet process mixture of rectified Gaussian distributions, and applied it in computational biology to the reconstruction of gene regulatory networks.

Extension to general bounds

An extension to the rectified Gaussian distribution was proposed by Palmer et al., allowing rectification between arbitrary lower and upper bounds. For lower and upper bounds $a$ and $b$ respectively, the cdf $F_{R}(x|\mu ,\sigma ^{2})$ is given by:

$F_{R}(x|\mu ,\sigma ^{2})={\begin{cases}0,&x<a,\\\Phi (x|\mu ,\sigma ^{2}),&a\leq x<b,\\1,&x\geq b,\end{cases}}$

where $\Phi (x|\mu ,\sigma ^{2})$ is the cdf of a normal distribution with mean $\mu$ and variance $\sigma ^{2}$. The mean and variance of the rectified distribution are calculated by first transforming the constraints to act on a standard normal distribution:

$c={\frac {a-\mu }{\sigma }},\qquad d={\frac {b-\mu }{\sigma }}.$

Using the transformed constraints, the mean and variance, $\mu _{R}$ and $\sigma _{R}^{2}$ respectively, are then given by:

$\mu _{t}={\frac {1}{\sqrt {2\pi }}}\left(e^{-{\frac {c^{2}}{2}}}-e^{-{\frac {d^{2}}{2}}}\right)+{\frac {c}{2}}\left(1+{\textrm {erf}}\left({\frac {c}{\sqrt {2}}}\right)\right)+{\frac {d}{2}}\left(1-{\textrm {erf}}\left({\frac {d}{\sqrt {2}}}\right)\right),$

${\begin{aligned}\sigma _{t}^{2}&={\frac {\mu _{t}^{2}+1}{2}}\left({\textrm {erf}}\left({\frac {d}{\sqrt {2}}}\right)-{\textrm {erf}}\left({\frac {c}{\sqrt {2}}}\right)\right)-{\frac {1}{\sqrt {2\pi }}}\left(\left(d-2\mu _{t}\right)e^{-{\frac {d^{2}}{2}}}-\left(c-2\mu _{t}\right)e^{-{\frac {c^{2}}{2}}}\right)\\&+{\frac {\left(c-\mu _{t}\right)^{2}}{2}}\left(1+{\textrm {erf}}\left({\frac {c}{\sqrt {2}}}\right)\right)+{\frac {\left(d-\mu _{t}\right)^{2}}{2}}\left(1-{\textrm {erf}}\left({\frac {d}{\sqrt {2}}}\right)\right),\end{aligned}}$

$\mu _{R}=\mu +\sigma \mu _{t},$

$\sigma _{R}^{2}=\sigma ^{2}\sigma _{t}^{2},$

where erf is the error function. This distribution was used by Palmer et al. for modelling physical resource levels, such as the quantity of liquid in a vessel, which is bounded by both 0 and the capacity of the vessel.
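The closed forms for $\mu _{R}$ and $\sigma _{R}^{2}$ can be implemented directly and cross-checked against Monte Carlo clipping of normal draws to $[a,b]$. A sketch in Python (standard library only; the function name and all parameter values are illustrative assumptions, not from Palmer et al.):

```python
import math
import random

def rectified_moments(mu, sigma, a, b):
    """Mean and variance of N(mu, sigma^2) rectified to [a, b],
    following the closed forms for mu_t and sigma_t^2 above."""
    c = (a - mu) / sigma
    d = (b - mu) / sigma
    phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)   # standard normal pdf
    erf2 = lambda z: math.erf(z / math.sqrt(2))                      # erf(z / sqrt(2))
    mu_t = (phi(c) - phi(d)
            + (c / 2) * (1 + erf2(c))
            + (d / 2) * (1 - erf2(d)))
    var_t = ((mu_t ** 2 + 1) / 2 * (erf2(d) - erf2(c))
             - ((d - 2 * mu_t) * phi(d) - (c - 2 * mu_t) * phi(c))
             + ((c - mu_t) ** 2 / 2) * (1 + erf2(c))
             + ((d - mu_t) ** 2 / 2) * (1 - erf2(d)))
    return mu + sigma * mu_t, sigma ** 2 * var_t

# Cross-check against Monte Carlo clipping (arbitrary example values).
random.seed(1)
mu, sigma, a, b = 1.0, 2.0, 0.0, 3.0
samples = [min(max(random.gauss(mu, sigma), a), b) for _ in range(200_000)]
mc_mean = sum(samples) / len(samples)
mc_var = sum((x - mc_mean) ** 2 for x in samples) / len(samples)

mean_r, var_r = rectified_moments(mu, sigma, a, b)
```

For these example values the analytic mean and variance should agree with the Monte Carlo estimates to within sampling error.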