# Nested sampling algorithm

The nested sampling algorithm is a computational approach to the problem of comparing models in Bayesian statistics, developed in 2004 by physicist John Skilling.[1]

## Background

Bayes' theorem can be applied to a pair of competing models ${\displaystyle M_{1}}$ and ${\displaystyle M_{2}}$ for data ${\displaystyle D}$, one of which may be true (though which one is unknown) but which cannot both be true simultaneously. The posterior probability for ${\displaystyle M_{1}}$ may be calculated as:

${\displaystyle {\begin{aligned}P(M_{1}\mid D)&={\frac {P(D\mid M_{1})P(M_{1})}{P(D)}}\\&={\frac {P(D\mid M_{1})P(M_{1})}{P(D\mid M_{1})P(M_{1})+P(D\mid M_{2})P(M_{2})}}\\&={\frac {1}{1+{\frac {P(D\mid M_{2})}{P(D\mid M_{1})}}{\frac {P(M_{2})}{P(M_{1})}}}}\end{aligned}}}$
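The last line of this identity is simple to evaluate once the Bayes factor is known. The following is a minimal sketch; the function name `posterior_m1` and its argument names are illustrative choices, not part of any library.

```python
def posterior_m1(bayes_factor_21: float, prior_odds_21: float = 1.0) -> float:
    """Posterior probability of M1 given the data.

    bayes_factor_21 is P(D | M2) / P(D | M1).
    prior_odds_21 is P(M2) / P(M1); it equals 1 for equal prior probabilities.
    """
    return 1.0 / (1.0 + bayes_factor_21 * prior_odds_21)


if __name__ == "__main__":
    print(posterior_m1(1.0))   # 0.5: the data do not favor either model
    print(posterior_m1(0.25))  # 0.8: the data favor M1 by a factor of four
```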

Given no a priori information in favor of ${\displaystyle M_{1}}$ or ${\displaystyle M_{2}}$, it is reasonable to assign prior probabilities ${\displaystyle P(M_{1})=P(M_{2})=1/2}$, so that ${\displaystyle P(M_{2})/P(M_{1})=1}$. The remaining Bayes factor ${\displaystyle P(D\mid M_{2})/P(D\mid M_{1})}$ is not so easy to evaluate, since in general it requires marginalizing nuisance parameters. Generally, ${\displaystyle M_{1}}$ has a set of parameters that can be grouped together and called ${\displaystyle \theta }$, and ${\displaystyle M_{2}}$ has its own vector of parameters that may be of different dimensionality, but is still termed ${\displaystyle \theta }$. The marginalization for ${\displaystyle M_{1}}$ is

${\displaystyle P(D\mid M_{1})=\int d\theta \,P(D\mid \theta ,M_{1})P(\theta \mid M_{1})}$

and likewise for ${\displaystyle M_{2}}$. This integral is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation. The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution ${\displaystyle P(\theta \mid D,M_{1})}$.[2] It is an alternative to methods from the Bayesian literature[3] such as bridge sampling and defensive importance sampling.
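To make the marginalization integral concrete, the sketch below estimates it for a toy one-parameter problem by averaging the likelihood over draws from the prior. Everything in it is an assumption made for illustration: the uniform prior on [0, 1], the Gaussian-shaped likelihood of width 0.1, and the function names. This naive average is not nested sampling; it works here only because the problem is tiny, and it tends to become inefficient when the likelihood is concentrated in a small part of the prior.

```python
import math
import random


def likelihood(theta: float, sigma: float = 0.1) -> float:
    """Toy likelihood P(D | theta, M): an unnormalized Gaussian bump at 0.5."""
    return math.exp(-0.5 * ((theta - 0.5) / sigma) ** 2)


def evidence_prior_average(n_draws: int = 100_000, seed: int = 0) -> float:
    """Estimate P(D | M) as the mean likelihood over Uniform(0, 1) prior draws."""
    rng = random.Random(seed)
    total = sum(likelihood(rng.random()) for _ in range(n_draws))
    return total / n_draws


def evidence_exact(sigma: float = 0.1) -> float:
    """Closed form of the same integral over [0, 1], available for this toy case."""
    return sigma * math.sqrt(2.0 * math.pi) * math.erf(0.5 / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    print(evidence_prior_average(), evidence_exact())
```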

Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density ${\displaystyle Z=P(D\mid M)}$ where ${\displaystyle M}$ is ${\displaystyle M_{1}}$ or ${\displaystyle M_{2}}$:

  Start with ${\displaystyle N}$ points ${\displaystyle \theta _{1},\ldots ,\theta _{N}}$ sampled from the prior.
  ${\displaystyle Z:=0;}$ ${\displaystyle X_{0}:=1;}$
  for ${\displaystyle i=1}$ to ${\displaystyle j}$ do        % The number of iterations j is chosen by guesswork.
      ${\displaystyle L_{i}:=\min(}$current likelihood values of the points${\displaystyle )}$;
      ${\displaystyle X_{i}:=\exp(-i/N);}$
      ${\displaystyle w_{i}:=X_{i-1}-X_{i};}$
      ${\displaystyle Z:=Z+L_{i}\cdot w_{i};}$
      Save the point with least likelihood as a sample point with weight ${\displaystyle w_{i}}$.
      Update the point with least likelihood with some Markov chain Monte Carlo steps according to the prior, accepting only steps that
      keep the likelihood above ${\displaystyle L_{i}}$.
  end
  return ${\displaystyle Z}$;
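The loop above can be sketched in Python for a toy problem. This is an illustration under stated assumptions, not a reference implementation: the prior is taken to be uniform on [0, 1], the likelihood is a Gaussian-shaped bump of width 0.1, and the Markov chain Monte Carlo update is replaced by plain rejection sampling from the prior, which is only practical for small problems like this one. The names `nested_sampling`, `n_live` and `n_iter` are hypothetical.

```python
import math
import random

SIGMA = 0.1  # width of the toy likelihood (assumption for this example)


def likelihood(theta: float) -> float:
    """Toy likelihood P(D | theta, M): an unnormalized Gaussian bump at 0.5."""
    return math.exp(-0.5 * ((theta - 0.5) / SIGMA) ** 2)


def nested_sampling(n_live: int = 200, n_iter: int = 1400, seed: int = 0):
    """Return the evidence estimate Z and the saved (theta, L_i, w_i) samples."""
    rng = random.Random(seed)
    # Start with N points sampled from the prior, here Uniform(0, 1).
    live = [rng.random() for _ in range(n_live)]
    live_l = [likelihood(t) for t in live]

    z = 0.0       # running evidence estimate
    x_prev = 1.0  # X_0: the whole prior mass
    samples = []

    for i in range(1, n_iter + 1):
        # Index and likelihood of the current worst live point.
        worst = min(range(n_live), key=live_l.__getitem__)
        l_i = live_l[worst]
        x_i = math.exp(-i / n_live)  # estimated remaining prior mass
        w_i = x_prev - x_i           # prior mass of the shell just removed
        z += l_i * w_i
        samples.append((live[worst], l_i, w_i))

        # Replace the worst point by a prior draw with likelihood above L_i.
        # (Rejection sampling stands in for the MCMC update of the pseudocode.)
        while True:
            theta = rng.random()
            l_new = likelihood(theta)
            if l_new > l_i:
                break
        live[worst], live_l[worst] = theta, l_new
        x_prev = x_i

    return z, samples


def evidence_exact() -> float:
    """Closed form of the evidence for this toy problem, used as a check."""
    return SIGMA * math.sqrt(2.0 * math.pi) * math.erf(0.5 / (SIGMA * math.sqrt(2.0)))


if __name__ == "__main__":
    z, samples = nested_sampling()
    print(z, evidence_exact())
```

As in the pseudocode, the sketch stops after a fixed number of iterations and ignores the small contribution of the live points that remain at the end.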


At each iteration, ${\displaystyle X_{i}}$ is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than ${\displaystyle L_{i}}$. The weight factor ${\displaystyle w_{i}}$ is an estimate of the amount of prior mass that lies between two nested hypersurfaces ${\displaystyle \{\theta \mid P(D\mid \theta ,M)=P(D\mid \theta _{i-1},M)\}}$ and ${\displaystyle \{\theta \mid P(D\mid \theta ,M)=P(D\mid \theta _{i},M)\}}$. The update step ${\displaystyle Z:=Z+L_{i}w_{i}}$ computes the sum over ${\displaystyle i}$ of ${\displaystyle L_{i}w_{i}}$ to numerically approximate the integral

${\displaystyle {\begin{aligned}P(D\mid M)&=\int P(D\mid \theta ,M)P(\theta \mid M)\,d\theta \\&=\int P(D\mid \theta ,M)\,dP(\theta \mid M)\end{aligned}}}$

The idea is to subdivide the range of ${\displaystyle f(\theta )=P(D\mid \theta ,M)}$ and estimate, for each interval ${\displaystyle [f(\theta _{i-1}),f(\theta _{i})]}$, how likely it is a priori that a randomly chosen ${\displaystyle \theta }$ would map to this interval. This can be thought of as a Bayesian's way to numerically implement Lebesgue integration.[4]
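The sum can also be read as a one-dimensional quadrature. The following is a sketch in the usual presentation of nested sampling; the symbols ${\displaystyle X(\lambda )}$, ${\displaystyle L(X)}$ and ${\displaystyle t_{i}}$ are introduced here for illustration. Define the prior mass enclosed by the likelihood contour at level ${\displaystyle \lambda }$ as

${\displaystyle X(\lambda )=\int _{f(\theta )>\lambda }P(\theta \mid M)\,d\theta ,}$

which decreases from 1 to 0 as ${\displaystyle \lambda }$ increases. Writing ${\displaystyle L(X)}$ for the inverse of this function, the evidence becomes

${\displaystyle Z=\int _{0}^{1}L(X)\,dX\approx \sum _{i=1}^{j}L_{i}\,(X_{i-1}-X_{i})=\sum _{i=1}^{j}L_{i}w_{i},\qquad X_{0}=1.}$

The assignment ${\displaystyle X_{i}=\exp(-i/N)}$ reflects the fact that discarding the worst of ${\displaystyle N}$ points shrinks the enclosed prior mass by a factor ${\displaystyle t_{i}}$ distributed as the largest of ${\displaystyle N}$ uniform variates on ${\displaystyle [0,1]}$, for which ${\displaystyle \operatorname {E} [\log t_{i}]=-1/N}$, so that ${\displaystyle \log X_{i}\approx -i/N}$.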

## Implementations

Example implementations of the nested sampling algorithm are publicly available in several programming languages, including Haskell, R, C++, and Python.[5][6][7][8][9][10][11]

## Applications

Since nested sampling was proposed in 2004, it has been used widely in astronomy. One paper suggested using nested sampling for cosmological model selection and object detection, as it "uniquely combines accuracy, general applicability and computational feasibility."[12] A refinement of the algorithm to handle multimodal posteriors has been suggested as a means to detect astronomical objects in extant datasets.[13] Nested sampling has also been applied to finite element model updating, where the algorithm is used to choose an optimal finite element model for problems in structural dynamics.[14]

## References

1. ^ Skilling, John (2004). "Nested Sampling". AIP Conference Proceedings. 735: 395–405. doi:10.1063/1.1835238.
2. ^ Skilling, John (2006). "Nested Sampling for General Bayesian Computation". Bayesian Analysis. 1 (4): 833–860. doi:10.1214/06-BA127.
3. ^ Chen, Ming-Hui; Shao, Qi-Man; Ibrahim, Joseph George (2000). Monte Carlo methods in Bayesian computation. Springer. ISBN 978-0-387-98935-8.
4. ^ Jasa, Tomislav; Xiang, Ning (2012). "Nested sampling applied in Bayesian room-acoustics decay analysis". Journal of the Acoustical Society of America. 132: 3251–3262. Bibcode:2012ASAJ..132.3251J. doi:10.1121/1.4754550.
5. ^ John Skilling website
6. ^ Nested sampling algorithm in Haskell at Hackage
7. ^ Nested sampling algorithm in R on Bojan Nikolic website
8. ^ Nested sampling algorithm in R on GitHub
9. ^ Nested sampling algorithm in C++ on GitHub
10. ^ Nested sampling algorithm in Python on GitHub
11. ^ Nested sampling algorithm for materials simulation on GitHub
12. ^ Mukherjee, P.; Parkinson, D.; Liddle, A.R. (2006). "A Nested Sampling Algorithm for Cosmological Model Selection". Astrophysical Journal. 638 (2): 51–54. arXiv:astro-ph/0508461. Bibcode:2006ApJ...638L..51M. doi:10.1086/501068.
13. ^ Feroz, F.; Hobson, M.P. (2008). "Multimodal nested sampling: an efficient and robust alternative to Markov Chain Monte Carlo methods for astronomical data analyses". MNRAS. 384 (2): 449–463. arXiv:0704.3704. Bibcode:2008MNRAS.384..449F. doi:10.1111/j.1365-2966.2007.12353.x.
14. ^ Mthembu, L.; Marwala, T.; Friswell, M.I.; Adhikari, S. (2011). "Model selection in finite element model updating using the Bayesian evidence statistic". Mechanical Systems and Signal Processing. 25 (7): 2399–2412. Bibcode:2011MSSP...25.2399M. doi:10.1016/j.ymssp.2011.04.001.