# Canonical ensemble

The canonical ensemble in statistical mechanics is a statistical ensemble representing a probability distribution of microscopic states of the system. For a system taking only discrete values of energy, the probability distribution is characterized by the probability $p_i$ of finding the system in a particular microscopic state $i$ with energy level $E_i$, conditioned on the prior knowledge that the total energy of the system and reservoir combined remains constant. This is given by the Boltzmann distribution,

$p_i = \tfrac{1}{Z}e^{-\frac{E_i}{kT}} = e^{-\frac{E_i -A}{kT}}$

where

$Z=e^{-\frac{A}{kT}}$

is the normalizing constant explained below ($A$ is the Helmholtz free energy). The Boltzmann distribution describes a system that can exchange energy with a heat bath (or alternatively with a large number of similar systems) so that its temperature remains constant. Equivalently, it is the distribution which has maximum entropy for a given average energy $\langle E \rangle$.

It is also referred to as the NVT ensemble: the number of particles $(N)$ and the volume $(V)$ of each system in the ensemble are constant and the ensemble has a well-defined temperature $(T)$, given by the temperature of the heat bath with which it would be in equilibrium.

The quantity $k$ is the Boltzmann constant, which relates the units of temperature to units of energy. It may be suppressed by expressing the absolute temperature using thermodynamic beta,

$\beta = \frac{1}{kT}$.

The quantities $A$ and $Z$ are constants for a particular ensemble, which ensure that $\sum_i p_i$ is normalised to 1. $Z$ is therefore given by

$Z = \sum_{i} e^{-\frac{E_i}{kT}} = \sum_{i} e^{-\beta E_i}$.

This is called the partition function of the canonical ensemble. Specifying this dependence of $Z$ on the energies $E_i$ conveys the same mathematical information as specifying the form of $p_i$ above.
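As a minimal numerical sketch of the formulas above, the following computes $p_i$, $Z$ and $A$ for a small discrete spectrum. The three energy levels and the value of $kT$ are hypothetical illustration values (in arbitrary but matching units), and the function name `boltzmann_probabilities` is introduced here only for the example.

```python
import math

def boltzmann_probabilities(energies, kT):
    """Return (probabilities, Z) for discrete energy levels at temperature kT."""
    weights = [math.exp(-e / kT) for e in energies]   # Boltzmann factors
    Z = sum(weights)                                   # partition function
    return [w / Z for w in weights], Z

# Hypothetical three-level system; energies and kT share the same units.
energies = [0.0, 1.0, 2.0]
kT = 1.5
probs, Z = boltzmann_probabilities(energies, kT)

# Helmholtz free energy from Z = exp(-A / kT).
A = -kT * math.log(Z)

# The same probabilities written as exp(-(E_i - A) / kT).
probs_from_A = [math.exp(-(e - A) / kT) for e in energies]
```

Both ways of writing the distribution give the same numbers, which is the sense in which specifying $Z$ carries the same information as specifying $p_i$.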

The canonical ensemble (and its partition function) is widely used as a tool to calculate thermodynamic quantities of a system under a fixed temperature. This article derives some basic elements of the canonical ensemble. Other related thermodynamic formulas are given in the partition function article. When viewed in a more general setting, the canonical ensemble is known as the Gibbs measure, where, because it has the Markov property of statistical independence, it occurs in many settings outside of the field of physics.

## Deriving the Boltzmann factor from ensemble theory

Let $E_i\,$ be the energy of the microstate $i\,$ and suppose there are $n_i\,$ members of the ensemble residing in this state. Further we assume the total number of members in the ensemble, $\mathcal{N}\,$, and the total energy of all systems of the ensemble, $\mathcal{E}\,$, are fixed, i.e.,

$\mathcal{N}= \sum_i n_i , \,$
$\mathcal{E}= \sum_i n_i E_i \,.$

Since systems in the ensemble are indistinguishable with respect to a macrostate, for each set $\{n_i\} \,$, the number of ways of shuffling systems is equal to

$W (\{n_i\}) = \frac{\mathcal{N}!}{ \prod_{i} n_i!} .$

So for a given $\{n_i\}\,$, there are $W(\{n_i\})\,$ rearrangements that specify the same state of the ensemble.
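The counting formula can be illustrated with a short sketch for a very small ensemble. The helper name `multiplicity` and the occupation lists are hypothetical choices made only for this example.

```python
import math

def multiplicity(occupations):
    """W = N! / prod_i(n_i!) for a list of occupation numbers n_i."""
    total = sum(occupations)
    w = math.factorial(total)
    for n in occupations:
        w //= math.factorial(n)   # exact: every partial quotient is an integer
    return w

# Three ensemble members distributed over three microstates.
w_all_same = multiplicity([3, 0, 0])   # all members in one state
w_two_one = multiplicity([2, 1, 0])    # two in one state, one in another
w_spread = multiplicity([1, 1, 1])     # one member per state
```

Even in this tiny case, spreading the members over more states increases the number of rearrangements.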

The most probable distribution is the one that maximizes $W (\{n_i\})\,$. The probability for any other distribution to occur is extremely small in the limit $\mathcal{N} \rightarrow \infty \,$. To determine this distribution, one maximizes $W (\{n_i\})\,$ with respect to the $n_i\,$'s, subject to the two constraints specified above. This can be done by using two Lagrange multipliers $\alpha \,$ and $\beta\,$. (The assumption $\mathcal{N} \rightarrow \infty \,$ is invoked in this calculation, since it allows one to apply Stirling's approximation.) The result is

$n_i = e^{-\alpha -\beta E_i} \,$.
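The intermediate steps can be sketched as follows, assuming the form of Stirling's approximation $\ln n! \approx n \ln n - n$ and maximizing $\ln W$ rather than $W$ (the two have the same maximizer). With $\mathcal{N}$ held fixed,

$\ln W \approx \mathcal{N} \ln \mathcal{N} - \mathcal{N} - \sum_i \left( n_i \ln n_i - n_i \right) .$

Requiring the constrained expression to be stationary with respect to each $n_i$ gives

$\frac{\partial}{\partial n_i} \left[ \ln W - \alpha \sum_j n_j - \beta \sum_j n_j E_j \right] = - \ln n_i - \alpha - \beta E_i = 0 ,$

and solving for $n_i$ reproduces $n_i = e^{-\alpha -\beta E_i}$.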

This distribution is called the canonical distribution. To determine $\alpha \,$ and $\beta\,$, it is useful to introduce the partition function as a sum over microscopic states

$Z(\beta) = \sum_j e^{-\beta E_j} .\,$

Comparing with thermodynamic formulae, it can be shown that $\beta\,$ is related to the absolute temperature $T\,$ by $\beta=1/kT\,$. Moreover, the expression

$F=- \frac{\ln Z(\beta)}{\beta}$

is identified as the Helmholtz free energy (denoted $A$ in the introduction). Consequently, from the partition function we can obtain the average thermodynamic quantities for the ensemble. For example, the average energy among members of the ensemble is

$\langle E \rangle = \frac{ \mathcal{E}}{ \mathcal{N} } = - \frac{\partial}{\partial \beta } \ln Z(\beta) \,$.

This relation can be used to determine $\beta\,$. $\alpha\,$ is determined from

$e^{\alpha} = \frac{ Z(\beta)}{ \mathcal{N}}$.
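The way $\beta$ and $\alpha$ follow from the two constraints can be sketched numerically. The example below assumes a hypothetical ensemble of 3000 members with three energy levels and total energy 2000 (arbitrary units). It finds $\beta$ by bisection, which is a choice made for this sketch and relies on $\langle E \rangle$ decreasing monotonically with $\beta$. All function names are illustrative.

```python
import math

def log_partition(beta, levels):
    """ln Z(beta) for a finite list of energy levels."""
    return math.log(sum(math.exp(-beta * e) for e in levels))

def mean_energy(beta, levels):
    """Canonical average energy: sum_i E_i exp(-beta E_i) / Z."""
    weights = [math.exp(-beta * e) for e in levels]
    return sum(e * w for e, w in zip(levels, weights)) / sum(weights)

def solve_beta(target, levels, lo=-50.0, hi=50.0, iterations=200):
    """Bisection for beta; assumes the root lies inside [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, levels) > target:
            lo = mid      # energy too high, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical ensemble: 3000 members, three levels, total energy 2000.
levels = [0.0, 1.0, 2.0]
n_members = 3000
total_energy = 2000.0

beta = solve_beta(total_energy / n_members, levels)
alpha = log_partition(beta, levels) - math.log(n_members)   # e^alpha = Z / N
occupations = [math.exp(-alpha - beta * e) for e in levels]

# Finite-difference check of <E> = -d ln Z / d beta.
h = 1e-6
energy_from_derivative = -(log_partition(beta + h, levels)
                           - log_partition(beta - h, levels)) / (2 * h)
```

The resulting occupations sum to the number of members and reproduce the total energy, so both constraints are satisfied by the pair $(\alpha, \beta)$.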

## A derivation from heat-bath viewpoint

Figure: a system of interest suspended in a heat bath. The system of interest is taken to be small compared to the heat bath.

Define the following:

• $S$ - the system of interest
• $S'$ - the heat reservoir in which $S$ resides; $S$ is small compared to $S'$
• $S^\ast$ - the system consisting of $S$ and $S'$ combined together
• $m$ - an indexing variable which labels all the available energy states of the system $S$
• $E_m$ - the energy of the state corresponding to the index $m$ for the system $S$
• $E'$ - the energy associated with the heat bath
• $E^\ast$ - the energy associated with $S^\ast$
• $\Omega'(E)$ - the number of microstates available to the heat reservoir at a particular energy $E$

It is assumed that the system $S$ and the reservoir $S'$ are in thermal equilibrium. The objective is to calculate the set of probabilities $p_m$ that $S$ is in a particular energy state $E_m$.

Suppose $S$ is in a microstate indexed by $m$. From the above definitions, the total energy of the system $S^\ast$ is given by

$E^\ast = E' + E_m \,$

Notice that $E^\ast$ is constant, since the combined system $S^\ast$ is taken to be isolated.

Now, arguably the key step in the derivation is that the probability of $S$ being in the $m$-th state, $p_m$, is proportional to the corresponding number of microstates available to the reservoir when $S$ is in the $m$-th state. Therefore,

$p_m = C'\Omega'(E') \,$

for some constant $C'$. Taking the logarithm gives

$\ln p_m = \ln C' + \ln \Omega' (E') = \ln C' + \ln \Omega' (E^* - E_m) \,$

Since $E_m$ is small compared to $E^\ast$, a Taylor series expansion of the latter logarithm can be performed around the energy $E^\ast$. A good approximation is obtained by keeping the first two terms of the expansion:

$\ln \Omega'(E') = \sum_{k=0}^\infty \frac{(E' - E^\ast )^k }{k!} \frac{d^k \ln \Omega' (E^\ast)}{dE'^k} \approx \ln \Omega'(E^\ast) - \frac{d}{dE'} \ln \Omega'(E^\ast) E_m$

The derivative appearing here is a constant, independent of $m$, which is traditionally denoted by $\beta$ and known as the thermodynamic beta:

$\beta = \frac{d}{dE'} \ln \Omega'(E^\ast) = \left . \frac{d}{dE'} \ln \Omega'(E') \right |_{E'=E^\ast}$

Finally,

$\ln p_m = \ln C' + \ln \Omega'(E^\ast) - \beta E_m \,$

Exponentiating this expression gives

$p_m = C' \Omega'(E^\ast) e^{-\beta E_m}$

The factor in front of the exponential can be treated as a normalization constant C, where

$C = C' \Omega'(E^\ast) \,$

From this

$p_m = C e^{-\beta E_m} \,$
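The linearisation used above can be checked numerically on a concrete reservoir. The sketch below assumes, purely for illustration, that the reservoir is an Einstein solid (a set of identical oscillators sharing energy quanta), whose multiplicity is the binomial coefficient $\binom{q + N - 1}{q}$. This model, the sizes chosen and the function name `ln_omega` are not part of the derivation itself.

```python
import math

def ln_omega(quanta, oscillators):
    """ln of the Einstein-solid multiplicity C(q + N - 1, q), via log-gamma."""
    return (math.lgamma(quanta + oscillators)
            - math.lgamma(quanta + 1)
            - math.lgamma(oscillators))

# Hypothetical reservoir; energies are counted in units of one quantum.
n_oscillators = 1_000_000
q_total = 2_000_000          # plays the role of E*

# beta = d ln(Omega') / dE' evaluated at E*, via a central difference.
beta = 0.5 * (ln_omega(q_total + 1, n_oscillators)
              - ln_omega(q_total - 1, n_oscillators))

# ln(p_m / p_0) from exact counting versus the linearised form -beta * E_m.
system_energies = [0, 1, 2, 3, 4, 5]
exact = [ln_omega(q_total - e, n_oscillators) - ln_omega(q_total, n_oscillators)
         for e in system_energies]
linear = [-beta * e for e in system_energies]
```

For system energies that are tiny compared with the reservoir energy, the exact and linearised values agree closely, which is the content of keeping only the first two Taylor terms.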

### Normalization to recover the partition function

Since probabilities must sum to 1, it must be the case that

$\sum_m p_m = 1 = \sum_m C e^{-\beta E_m} = C \sum_m e^{-\beta E_m} \iff C = \frac{1}{\sum_m e^{-\beta E_m}} \equiv \frac{1}{Z(\beta)}$

where $Z$ is known as the partition function for the canonical ensemble.

### Note on derivation

As mentioned above, the derivation hinges on recognizing that the probability of the system being in a particular state is proportional to the corresponding multiplicity of the reservoir (the same can be said for the grand canonical ensemble). Once that observation is made, there is some freedom in how to proceed. In the derivation given, the logarithm is taken, then a linear approximation based on physical arguments is used. Alternatively, one can apply the thermodynamic identity for the differential of the entropy:

$d S = {1 \over T} (d U + P d V - \mu d N)$

and obtain the same result. See the article on Maxwell-Boltzmann statistics where this approach is employed.
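A brief sketch of that alternative route, written with the symbols of the heat-bath derivation above: let $S_{\mathrm{res}} = k \ln \Omega'$ denote the entropy of the reservoir (the subscript is notation introduced here to avoid a clash with the system label $S$). At fixed $V$ and $N$ the identity gives $\partial S_{\mathrm{res}} / \partial E' = 1/T$, so that to first order in $E_m$

$\ln \Omega'(E^\ast - E_m) = \frac{S_{\mathrm{res}}(E^\ast - E_m)}{k} \approx \frac{S_{\mathrm{res}}(E^\ast)}{k} - \frac{E_m}{kT} ,$

and therefore

$p_m \propto \Omega'(E^\ast - E_m) \propto e^{-\frac{E_m}{kT}} ,$

in agreement with $\beta = 1/kT$.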

The canonical ensemble is also called the Gibbs ensemble, in honor of J. W. Gibbs, widely regarded, with Boltzmann, as one of the two fathers of statistical mechanics. In his definitive 1902 book "Elementary Principles in Statistical Mechanics", Gibbs viewed an ensemble as a list of the allowed states of the system (each state appearing once and only once in the list) and the associated statistical weights. The states do not interact with each other, or with a reservoir, until Gibbs treats what happens when two complete ensembles at two different temperatures are allowed to interact weakly (Gibbs, p. 160). Gibbs writes that "...the distribution in phase..." (the phase space density in modern language) "...[is] called canonical...[if] the index of probability" (the logarithm of the statistical weight of the phase space density) "...is a linear function of the energy..." (Gibbs, Ch. 4). In Gibbs' formulation, this requirement (his equation 91), written in modern notation as

$P = e^{\frac{A-E}{kT} } \,$

is taken to define the canonical ensemble and to be the fundamental postulate. Gibbs does show that a large collection of interacting microcanonical systems approaches the canonical ensemble, but this is part of his demonstration (Gibbs, pp. 169–183) that the principle of equal a priori probabilities, and therefore the microcanonical ensemble, is inferior to the canonical ensemble as an axiomatization of statistical mechanics at every point where the two treatments differ.

Gibbs' original formulation is still standard in modern mathematically rigorous treatments of statistical mechanics, where the canonical ensemble is defined as the probability measure

$e^{ {A - E \over kT} } dp \, dq$

with $p$ and $q$ being the canonical coordinates.

### Characteristic state function

The characteristic state function of the canonical ensemble is the Helmholtz free energy function, as the following relationship holds:

$Z(T,V,N) = e^{- \beta A} \,\;$
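One way to see why $A$ is called the characteristic function is that other thermodynamic quantities follow from it by differentiation, for instance the entropy $S = -\partial A / \partial T$ at fixed $V$ and $N$. The following sketch checks this against the Gibbs entropy $-k \sum_i p_i \ln p_i$ for a hypothetical three-level spectrum, in units where $k = 1$. The spectrum, temperature and function names are assumptions of the example.

```python
import math

K = 1.0  # Boltzmann constant in natural units (an assumption of this sketch)

def helmholtz(T, levels):
    """A(T) = -k T ln Z(T) for a finite list of energy levels."""
    Z = sum(math.exp(-e / (K * T)) for e in levels)
    return -K * T * math.log(Z)

def gibbs_entropy(T, levels):
    """S = -k sum_i p_i ln p_i with Boltzmann probabilities p_i."""
    weights = [math.exp(-e / (K * T)) for e in levels]
    Z = sum(weights)
    return -K * sum((w / Z) * math.log(w / Z) for w in weights)

levels = [0.0, 1.0, 3.0]   # hypothetical spectrum
T = 2.0
dT = 1e-5

# Entropy obtained from the characteristic function: S = -dA/dT.
entropy_from_A = -(helmholtz(T + dT, levels)
                   - helmholtz(T - dT, levels)) / (2 * dT)
entropy_direct = gibbs_entropy(T, levels)

# Average energy recovered as U = A + T S.
energy_from_A = helmholtz(T, levels) + T * entropy_from_A
```

The two entropies agree to within the finite-difference error, and $A + TS$ reproduces the canonical average energy.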

## Quantum mechanical systems

By applying the canonical partition function, one can obtain the corresponding results for a canonical ensemble of quantum mechanical systems. A quantum mechanical ensemble in general is described by a density matrix. Suppose the Hamiltonian $H$ of interest is a self-adjoint operator with only discrete spectrum. The energy levels $\{ E_n \}$ are then the eigenvalues of $H$, with corresponding eigenvectors $| \psi _n \rangle$. By the same considerations as in the classical case, the probability that a system from the ensemble will be in state $| \psi _n \rangle$ is $p_n = C e^{- \beta E_n}$, for some constant $C$. So the ensemble is described by the density matrix

$\rho = \sum p_n | \psi _n \rangle \langle \psi_n | = \sum C e^{- \beta E_n} | \psi _n \rangle \langle \psi_n|$

(Technical note: a density matrix must be trace-class, so we have also assumed that the sequence of energy eigenvalues diverges sufficiently fast.) A density operator is assumed to have trace 1, so

$\operatorname{Tr} (\rho) = C \underbrace{\sum e^{- \beta E_n}}_Q = 1,$

which means

$C = \frac{1}{\sum e^{- \beta E_n} } = \frac{1}{Q}.$

$Q$ is the quantum-mechanical version of the canonical partition function. Putting $C$ back into the equation for $\rho$ gives

$\rho = \frac{1}{\sum e^{- \beta E_n}} \sum e^{- \beta E_n} | \psi _n \rangle \langle \psi_n| = \frac{1}{ \operatorname{Tr}( e^{- \beta H} ) } e^{- \beta H} .$

Since the energy eigenvalues are assumed to diverge, the Hamiltonian $H$ is an unbounded operator, and the Borel functional calculus has been invoked to exponentiate it. Alternatively, in non-rigorous fashion, one can take $e^{-\beta H}$ to be defined by the exponential power series.

Notice the quantity

$\operatorname{Tr}( e^{- \beta H} )$

is the quantum mechanical counterpart of the canonical partition function, being the normalization factor for the mixed state of interest.

The density operator $\rho$ obtained above therefore describes the (mixed) state of a canonical ensemble of quantum mechanical systems. As with any density operator, if $A$ is a physical observable (here an operator, not the Helmholtz free energy), then its expected value is

$\langle A \rangle = \operatorname{Tr}( \rho A ).$
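A minimal sketch of these formulas for the smallest nontrivial case, a two-level system: the Hamiltonian $H = -h\,\sigma_x$, the values of $h$ and $\beta$, and the helper functions are all hypothetical choices for illustration. The eigenpairs of this $H$ are known in closed form, so $\rho$ can be built directly from $\sum_n p_n | \psi_n \rangle \langle \psi_n |$ using plain nested lists.

```python
import math

def outer(v):
    """|v><v| for a real two-component vector."""
    return [[v[0] * v[0], v[0] * v[1]],
            [v[1] * v[0], v[1] * v[1]]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

# Hypothetical model: H = -h * sigma_x, with eigenpairs known in closed form.
h, beta = 0.7, 1.3
sigma_x = [[0.0, 1.0], [1.0, 0.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(-h, [s, s]),     # sigma_x = +1 eigenvector has energy -h
              (+h, [s, -s])]    # sigma_x = -1 eigenvector has energy +h

# rho = sum_n p_n |psi_n><psi_n| with p_n = exp(-beta E_n) / Q.
Q = sum(math.exp(-beta * e) for e, _ in eigenpairs)
rho = [[0.0, 0.0], [0.0, 0.0]]
for energy, vec in eigenpairs:
    p = math.exp(-beta * energy) / Q
    proj = outer(vec)
    for i in range(2):
        for j in range(2):
            rho[i][j] += p * proj[i][j]

expectation_sigma_x = trace(matmul(rho, sigma_x))   # <A> = Tr(rho A)
```

For this model the trace formula reduces to $\langle \sigma_x \rangle = \tanh(\beta h)$, which gives a simple closed-form value to compare against.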

## Issues in the traditional models of the derivation of the canonical distribution

Understanding and clearly presenting the derivation of the canonical distribution is difficult for both students and teachers. The difficulty stems from the complexity of the subject, but also from the fact that mathematical schemes which merely yield the desired result are confused with physical models.

These schemes consider only the eigenstates of the system to be system states. However, a great many quantum superpositions of the system eigenstates correspond to the same value of the energy of the system.

In one of the schemes the system $S$ is considered to be a part of a huge system $U$, usually called the "Universe". The environment $W$ of the system (the complement of $S$ in $U$) is often called "the thermostat". The "Universe" is described by the microcanonical ensemble. This means that the "Universe" is in equilibrium, its energy lies in a very small interval, only eigenstates of the "Universe" are possible, and all these eigenstates are equiprobable.

Another scheme, the method of the most probable distribution, assumes that the "Universe" consists of a very large number of systems identical to the system under consideration. In both schemes the interaction of the system with its environment is considered extremely weak, so that it is possible to speak of a definite quantum state of the system $S$. At the same time, the transition of the system from one eigenstate to another is considered to be caused by its energy exchange with the environment. The canonical distribution can be used to calculate observed quantities only if, during the measurement, the system has time to visit all states of the spectrum repeatedly. However, the aforementioned schemes do not relate the strength of the coupling to the environment, the spectral range of the system energy, and the measurement time to one another.

The schemes used for the derivation of the canonical distribution do not call into question the absolute accuracy of quantum mechanics. However, one must remember that both classical and quantum mechanics arose from the observation of systems with a small number of objects. If the number of objects (e.g. particles) in a system is small and calculations are possible, the mechanics show remarkable accuracy. One might assume that quantum mechanics would also be absolutely exact in systems with a macroscopically large number of particles. However, this assumption contradicts the irreversibility of the evolution of macrosystems, the second law of thermodynamics, and the experimental data obtained on specific physical objects.

The experimental evidence for probabilistic processes that are not described by the standard quantum formalism allows the canonical distribution to be regarded as the result of averaging, over the various system states with the same energy, the squared moduli of the coefficients in the expansion of the system state function in eigenfunctions. But the question then arises of what exactly the system states are, in light of the existence of probabilistic processes that are not considered by the standard formalism of quantum mechanics (in particular, writing the state function as a superposition of eigenfunctions of the whole macrosystem may not be quite adequate).

## Relations with other ensembles

A generalization of this is the grand canonical ensemble, in which the systems may share particles as well as energy. By contrast, in the microcanonical ensemble, the energy of each individual system is fixed.