# A priori probability

Not to be confused with prior probability.

An a priori probability is a probability that is derived purely by deductive reasoning.[1] One way of deriving a priori probabilities is the principle of indifference, which states that if there are N mutually exclusive and exhaustive events and they are equally likely, then the probability of a given event occurring is 1/N. Similarly, the probability that one of a given collection of K events occurs is K/N.
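As a minimal illustration of the principle of indifference, the following Python sketch computes K/N exactly for a fair six-sided die. The function name `indifference_probability` is hypothetical and chosen only for this example.

```python
from fractions import Fraction


def indifference_probability(k: int, n: int) -> Fraction:
    """Probability that one of k chosen events occurs, given n equally
    likely, mutually exclusive and exhaustive events (hypothetical helper)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    return Fraction(k, n)


# A fair die has N = 6 equally likely faces.
p_single = indifference_probability(1, 6)  # one given face
p_even = indifference_probability(3, 6)    # any of the faces 2, 4, 6
print(p_single, p_even)  # 1/6 1/2
```

Using `Fraction` keeps the result exact, which matches the deductive character of an a priori probability.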

One disadvantage of defining probabilities in the above way is that it applies only to finite collections of events.

In Bayesian inference, the terms "uninformative priors" or "objective priors" refer to particular choices of a priori probabilities.[2] Note that "prior probability" is a broader concept.

Similar to the distinction in philosophy between a priori and a posteriori, in Bayesian inference a priori denotes general knowledge about the data distribution before making an inference, while a posteriori denotes knowledge that incorporates the results of making an inference.[3]

## A Priori Probability in Statistical Mechanics

The a priori probability has an important application in statistical mechanics. The classical version is defined as the ratio of the number of elementary events belonging to the outcome in question (e.g. the faces of a die showing a given number) to the total number of elementary events. In the case of the die, each elementary event has the same probability, so the probability of each outcome of throwing a (perfect) die is 1/6. Each face of the die appears with equal probability, probability being a measure defined for each event.

In statistical mechanics, e.g. that of a gas contained in a finite volume, both the spatial coordinates ${\displaystyle q_{i}}$ and the momentum coordinates ${\displaystyle p_{i}}$ of the individual gas elements (atoms or molecules) take values in finite ranges of the phase space spanned by these coordinates. In analogy to the case of the die, the a priori probability is here (in the case of a continuum) proportional to the phase space volume element ${\displaystyle \Delta q\Delta p}$, which measures the number of standing waves ("states") therein, where ${\displaystyle \Delta q}$ is the range of the variable ${\displaystyle q}$ and ${\displaystyle \Delta p}$ is the range of the variable ${\displaystyle p}$ (here for simplicity considered in one dimension). An important consequence is a result known as Liouville's theorem (frequently also cited under different names), i.e. the time independence of this phase space volume element and thus of the a priori probability. A time dependence of this quantity would imply known information about the dynamics of the system, and hence it would not be an a priori probability.[4][5] Thus the region

${\displaystyle \Omega :={\frac {\Delta q\Delta p}{\int \Delta q\Delta p}},\;\;\;\int \Delta q\Delta p=\mathrm {const.} ,}$

when differentiated with respect to time ${\displaystyle t}$, yields zero: the volume at time ${\displaystyle t}$ is the same as at time zero. This is also described as conservation of information. In quantum mechanics there is an analogous conservation law. In this case the phase space region is replaced by a subspace of the space of states, expressed in terms of a projection operator ${\displaystyle P}$, and instead of the probability in phase space one has the probability density

${\displaystyle \Sigma :={\frac {P}{\operatorname {Tr} P}},\;\;\;N=\operatorname {Tr} P=\mathrm {const.} ,}$

where ${\displaystyle N}$ is the dimensionality of the subspace. The conservation law in this case is expressed by the unitarity of the S-matrix. In either case the considerations assume a closed isolated system.
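The time independence of the classical phase space volume can be checked numerically in a simple case. The sketch below assumes a one-dimensional harmonic oscillator, chosen only because its Hamiltonian flow is known in closed form; the mass, angular frequency, elapsed time and initial cell are arbitrary illustrative values, and the helper names are hypothetical. Because this flow is linear, it maps a polygon to a polygon, so the area of a small cell in the ${\displaystyle (q,p)}$ plane can be compared before and after the evolution.

```python
import math


def evolve(q0, p0, t, m=1.0, omega=2.0):
    """Exact flow of H = p^2/(2m) + m*omega^2*q^2/2 after time t."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    q = q0 * c + p0 / (m * omega) * s
    p = p0 * c - m * omega * q0 * s
    return q, p


def polygon_area(points):
    """Area of a polygon in the (q, p) plane via the shoelace formula."""
    total = 0.0
    for i, (q1, p1) in enumerate(points):
        q2, p2 = points[(i + 1) % len(points)]
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0


# Rectangular cell with Delta q = 1.0 and Delta p = 0.5 (illustrative values).
cell = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.5), (0.0, 0.5)]
area_start = polygon_area(cell)
area_later = polygon_area([evolve(q, p, t=0.7) for q, p in cell])
print(area_start, area_later)
```

The cell is sheared and rotated by the flow, but the two areas agree up to rounding, as expected from a map whose Jacobian determinant is 1.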

## Example

The following example illustrates the a priori probability (or a priori weighting) in (a) classical and (b) quantal contexts.

(a) Classical a priori probability

Consider the rotational energy E of a diatomic molecule with moment of inertia I in spherical polar coordinates ${\displaystyle \theta ,\phi }$, i.e.

${\displaystyle E={\frac {1}{2I}}(p_{\theta }^{2}+{\frac {p_{\phi }^{2}}{\sin ^{2}\theta }}).}$

The ${\displaystyle (p_{\theta },p_{\phi })}$-curve for constant E and ${\displaystyle \theta }$ is an ellipse of area

${\displaystyle \oint dp_{\theta }dp_{\phi }=\pi {\sqrt {2IE}}{\sqrt {2IE}}\sin \theta =2\pi IE\sin \theta }$.

By integrating over ${\displaystyle \theta }$ and ${\displaystyle \phi }$ the total volume of phase space covered for constant energy E is

${\displaystyle \int _{0}^{\phi =2\pi }\int _{0}^{\theta =\pi }2I\pi E\sin \theta d\theta d\phi =8\pi ^{2}IE=\oint dp_{\theta }dp_{\phi }d\theta d\phi }$,

and hence the number of "states" in the energy range ${\displaystyle dE}$ is

${\displaystyle \Omega \propto }$ (phase space volume at ${\displaystyle E+dE}$) minus (phase space volume at ${\displaystyle E}$) ${\displaystyle =8{\pi }^{2}I\,dE.}$
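The classical result can be checked numerically. The following sketch integrates the ellipse area ${\displaystyle 2\pi IE\sin \theta }$ over ${\displaystyle \theta }$ with a midpoint rule and multiplies by the ${\displaystyle \phi }$ range of ${\displaystyle 2\pi }$. The values of the moment of inertia, the energy and the step count are arbitrary assumptions in unspecified units, and the function name is hypothetical.

```python
import math


def phase_space_volume(moment, energy, n_theta=1000):
    """Phase space volume enclosed by the surface of constant rotational
    energy, obtained by integrating 2*pi*I*E*sin(theta) over the angles."""
    d_theta = math.pi / n_theta
    theta_integral = d_theta * sum(
        2.0 * math.pi * moment * energy * math.sin((k + 0.5) * d_theta)
        for k in range(n_theta)
    )
    # The integrand does not depend on phi, so that integral gives 2*pi.
    return 2.0 * math.pi * theta_integral


moment, energy, d_energy = 1.0, 1.0, 1e-3  # illustrative values only
volume = phase_space_volume(moment, energy)
shell = phase_space_volume(moment, energy + d_energy) - volume

print(volume, 8 * math.pi**2 * moment * energy)   # total volume vs. 8*pi^2*I*E
print(shell, 8 * math.pi**2 * moment * d_energy)  # shell vs. 8*pi^2*I*dE
```

Since the volume is linear in E, the shell between E and E + dE does not depend on E itself, which is why the classical weighting of the rotator is uniform in energy.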

(b) Quantal a priori probability

Assuming that the number of quantum states in a range ${\displaystyle \Delta q\Delta p}$ for each direction of motion is given, per element, by a factor ${\displaystyle \Delta q\Delta p/h}$, the number of states in the energy range dE is, as seen under (a), ${\displaystyle 8\pi ^{2}IdE/h^{2}}$ for the rotating diatomic molecule, which has two pairs of conjugate coordinates and hence two factors of ${\displaystyle 1/h}$. From wave mechanics it is known that the energy levels of a rotating diatomic molecule are given by

${\displaystyle E_{n}={\frac {n(n+1)h^{2}}{8\pi ^{2}I}},}$

each such level being (2n+1)-fold degenerate. By evaluating ${\displaystyle dn/dE_{n}=1/(dE_{n}/dn)}$ one obtains

${\displaystyle {\frac {dn}{dE_{n}}}={\frac {8\pi ^{2}I}{(2n+1)h^{2}}},\;\;\;(2n+1)dn={\frac {8\pi ^{2}I}{h^{2}}}dE_{n}.}$

Thus by comparison with ${\displaystyle \Omega }$ above, one finds that the approximate number of states in the range dE is given by

${\displaystyle \Sigma \propto (2n+1)dn.}$
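The correspondence between the two counts can be illustrated numerically. The sketch below works in assumed units with h = 1 and I = 1, so only ratios are meaningful, and its function names are hypothetical. It sums the degeneracies (2n+1) of all levels up to ${\displaystyle E_{n}}$ and compares the result with the classical estimate ${\displaystyle 8\pi ^{2}IE_{n}/h^{2}}$.

```python
import math

H = 1.0       # Planck constant in assumed units (h = 1)
MOMENT = 1.0  # moment of inertia in assumed units (I = 1)


def level_energy(n):
    """Rotational level E_n = n(n+1) h^2 / (8 pi^2 I)."""
    return n * (n + 1) * H**2 / (8 * math.pi**2 * MOMENT)


def quantum_state_count(n_max):
    """Number of states in levels 0..n_max, each (2n+1)-fold degenerate."""
    return sum(2 * n + 1 for n in range(n_max + 1))


def classical_state_count(energy):
    """Phase space volume 8 pi^2 I E divided by h^2."""
    return 8 * math.pi**2 * MOMENT * energy / H**2


for n_max in (10, 100, 1000):
    ratio = quantum_state_count(n_max) / classical_state_count(level_energy(n_max))
    print(n_max, ratio)  # the ratio approaches 1 as n_max grows
```

In these units the quantum count is (n+1)² and the classical estimate is n(n+1), so their ratio is (n+1)/n and the two agree in the limit of large quantum numbers.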

In the case of the one-dimensional simple harmonic oscillator of natural frequency ${\displaystyle \nu }$ one finds correspondingly: (a) ${\displaystyle \Omega \propto dE/\nu }$, and (b) ${\displaystyle \Sigma \propto dn}$ (no degeneracy). Thus in quantum mechanics the a priori probability is effectively a measure of the degeneracy. In the case of the hydrogen atom or Coulomb potential (where the evaluation of the phase space volume for constant energy is more complicated) one knows that the quantum mechanical degeneracy is ${\displaystyle n^{2}}$ with ${\displaystyle E\propto 1/n^{2}}$. Thus in this case ${\displaystyle \Sigma \propto n^{2}dn}$.

## References

1. ^ Mood A.M., Graybill F.A., Boes D.C. (1974) Introduction to the Theory of Statistics (3rd Edition). McGraw-Hill. Section 2.2 (available online)
2. ^ E.g. Harold J. Price and Allison R. Manson, "Uninformative priors for Bayes’ theorem", AIP Conf. Proc. 617, 2001
3. ^ Eidenberger, Horst (2014), Categorization and Machine Learning: The Modeling of Human Understanding in Computers, Vienna University of Technology, p. 109, ISBN 9783735761903.
4. ^ Harald J.W. Müller-Kirsten, Basics of Statistical Physics, 2nd ed., World Scientific (Singapore, 2013)
5. ^ A. Ben-Naim, Entropy Demystified, World Scientific (Singapore, 2007)