# System size expansion

The system size expansion, also known as van Kampen's expansion or the Ω-expansion, is a technique pioneered by Nico van Kampen[1] used in the analysis of stochastic processes. Specifically, it allows one to find an approximation to the solution of a master equation with nonlinear transition rates. The leading order term of the expansion is given by the linear noise approximation, in which the master equation is approximated by a Fokker–Planck equation with linear coefficients determined by the transition rates and stoichiometry of the system.

Less formally, it is normally straightforward to write down a mathematical description of a system in which processes happen randomly (for example, radioactive atoms decaying at random in a physical system, or genes being expressed stochastically in a cell). However, these descriptions are often too difficult to solve when studying the system's statistics (for example, the mean and variance of the number of atoms or proteins as a function of time). The system size expansion yields an approximate statistical description that can be solved much more easily than the master equation.

## Preliminaries

Systems that admit a treatment with the system size expansion may be described by a probability distribution ${\displaystyle P(\mathbf {X} ,t)}$, giving the probability of observing the system in state ${\displaystyle \mathbf {X} }$ at time ${\displaystyle t}$. ${\displaystyle \mathbf {X} }$ may be, for example, a vector with elements corresponding to the number of molecules of different chemical species in a system. In a system of size ${\displaystyle \Omega }$ (intuitively interpreted as the volume), we will adopt the following nomenclature: ${\displaystyle \mathbf {X} }$ is a vector of macroscopic copy numbers, ${\displaystyle \mathbf {x} =\mathbf {X} /\Omega }$ is a vector of concentrations, and ${\displaystyle \mathbf {\phi } }$ is a vector of deterministic concentrations, as they would appear according to the rate equation in an infinite system. ${\displaystyle \mathbf {x} }$ and ${\displaystyle \mathbf {X} }$ are thus quantities subject to stochastic effects, whereas ${\displaystyle \mathbf {\phi } }$ is not.

A master equation describes the time evolution of this probability.[1] Henceforth, a system of chemical reactions[2] will be discussed to provide a concrete example, although the nomenclature of "species" and "reactions" is generalisable. A system involving ${\displaystyle N}$ species and ${\displaystyle R}$ reactions can be described with the master equation:

${\displaystyle {\frac {dP(\mathbf {X} ,t)}{dt}}=\Omega \sum _{j=1}^{R}\left(\prod _{i=1}^{N}\mathbb {E} ^{-S_{ij}}-1\right)f_{j}(\mathbf {x} ,\Omega )P(\mathbf {X} ,t).}$

Here, ${\displaystyle \Omega }$ is the system size, ${\displaystyle \mathbb {E} }$ is an operator which will be addressed later, ${\displaystyle S_{ij}}$ is the stoichiometric matrix for the system (in which element ${\displaystyle S_{ij}}$ gives the stoichiometric coefficient for species ${\displaystyle i}$ in reaction ${\displaystyle j}$), and ${\displaystyle f_{j}}$ is the rate of reaction ${\displaystyle j}$ given a state ${\displaystyle \mathbf {x} }$ and system size ${\displaystyle \Omega }$.

${\displaystyle \mathbb {E} ^{-S_{ij}}}$ is a step operator,[1] removing ${\displaystyle S_{ij}}$ from the ${\displaystyle i}$th element of its argument. For example, ${\displaystyle \mathbb {E} ^{-S_{23}}f(x_{1},x_{2},x_{3})=f(x_{1},x_{2}-S_{23},x_{3})}$. This formalism will be useful later.
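The action of the step operator can be illustrated with a minimal Python sketch. The helper `step`, the test function `f`, and the value chosen for ${\displaystyle S_{23}}$ are hypothetical and serve only to make the shift explicit.

```python
def step(f, i, power):
    """Return E_i^power f: the function f with its i-th argument shifted by `power`."""
    def shifted(x):
        y = list(x)
        y[i] += power
        return f(tuple(y))
    return shifted

# Hypothetical test function of three copy numbers (not from the article).
f = lambda x: x[0] + 10 * x[1] + 100 * x[2]

S_23 = 3                       # illustrative stoichiometric coefficient
g = step(f, 1, -S_23)          # E^{-S_23} acts on the second argument (index 1)
print(g((1, 2, 3)) == f((1, 2 - S_23, 3)))
```

Applying `step` once per species, with powers ${\displaystyle -S_{ij}}$, reproduces the product of step operators in the master equation.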

The master equation above can be interpreted as follows. The initial sum on the right-hand side is over all reactions. For each reaction ${\displaystyle j}$, the brackets immediately following the sum give two terms. The term with the simple coefficient −1 gives the probability flux away from a given state ${\displaystyle \mathbf {X} }$ due to reaction ${\displaystyle j}$ changing the state. The term preceded by the product of step operators gives the probability flux due to reaction ${\displaystyle j}$ changing a different state ${\displaystyle \mathbf {X'} }$ into state ${\displaystyle \mathbf {X} }$. The product of step operators constructs this state ${\displaystyle \mathbf {X'} }$.

### Example

For example, consider the (linear) chemical system involving two chemical species ${\displaystyle X_{1}}$ and ${\displaystyle X_{2}}$ and the reaction ${\displaystyle X_{1}\rightarrow X_{2}}$. In this system, ${\displaystyle N=2}$ (species), ${\displaystyle R=1}$ (reactions). A state of the system is a vector ${\displaystyle \mathbf {X} =\{n_{1},n_{2}\}}$, where ${\displaystyle n_{1},n_{2}}$ are the number of molecules of ${\displaystyle X_{1}}$ and ${\displaystyle X_{2}}$ respectively. Let ${\displaystyle f_{1}(\mathbf {x} ,\Omega )={\frac {n_{1}}{\Omega }}=x_{1}}$, so that the rate of reaction 1 (the only reaction) depends on the concentration of ${\displaystyle X_{1}}$. The stoichiometry matrix is ${\displaystyle (-1,1)^{T}}$.

${\displaystyle {\begin{aligned}{\frac {dP(\mathbf {X} ,t)}{dt}}&=\Omega \left(\mathbb {E} ^{-S_{11}}\mathbb {E} ^{-S_{21}}-1\right)f_{1}\left({\frac {\mathbf {X} }{\Omega }}\right)P(\mathbf {X} ,t)\\&=\Omega \left(f_{1}\left({\frac {\mathbf {X} +\mathbf {\Delta X} }{\Omega }}\right)P\left(\mathbf {X} +\mathbf {\Delta X} ,t\right)-f_{1}\left({\frac {\mathbf {X} }{\Omega }}\right)P\left(\mathbf {X} ,t\right)\right),\end{aligned}}}$

where ${\displaystyle \mathbf {\Delta X} =\{1,-1\}}$ is the shift caused by the action of the product of step operators, required to change state ${\displaystyle \mathbf {X} }$ to a precursor state ${\displaystyle \mathbf {X} '}$.
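Because this example is linear, its master equation can be integrated directly. The sketch below is a minimal illustration under stated assumptions: the total molecule number (20), the initial condition (all molecules are ${\displaystyle X_{1}}$), and the fixed-step Runge-Kutta integrator are choices made here, not part of the method. Since ${\displaystyle n_{1}+n_{2}}$ is conserved, the state is indexed by ${\displaystyle n_{1}}$ alone, and the result is compared with the binomial statistics expected for independent first-order decay.

```python
import numpy as np

def generator(n_total):
    """Matrix M with dP/dt = M @ P for X1 -> X2, indexed by n1 = 0..n_total.

    The propensity is Omega * f_1 = n1, so Omega drops out of this linear example.
    """
    M = np.zeros((n_total + 1, n_total + 1))
    for n1 in range(n_total + 1):
        M[n1, n1] = -n1                  # flux out of state n1 (the "-1" term)
        if n1 + 1 <= n_total:
            M[n1, n1 + 1] = n1 + 1       # flux in from the precursor state n1 + 1
    return M

def integrate(P0, M, t_end, dt=1e-3):
    """Fixed-step fourth-order Runge-Kutta integration of dP/dt = M @ P."""
    P = P0.copy()
    for _ in range(int(round(t_end / dt))):
        k1 = M @ P
        k2 = M @ (P + 0.5 * dt * k1)
        k3 = M @ (P + 0.5 * dt * k2)
        k4 = M @ (P + dt * k3)
        P = P + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return P

n_total = 20                             # assumed total number of molecules
P0 = np.zeros(n_total + 1)
P0[n_total] = 1.0                        # assumed start: every molecule is X1
P = integrate(P0, generator(n_total), t_end=1.0)

n = np.arange(n_total + 1)
mean = n @ P
var = (n ** 2) @ P - mean ** 2
p_survive = np.exp(-1.0)                 # survival probability of one X1 at t = 1
print(mean, n_total * p_survive)                       # numerical vs binomial mean
print(var, n_total * p_survive * (1 - p_survive))      # numerical vs binomial variance
```

For nonlinear rates the state space and the generator generally become too large or too complicated for this direct approach, which is the motivation for the expansion that follows.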

## Linear noise approximation

If the master equation possesses nonlinear transition rates, it may be impossible to solve it analytically. The system size expansion utilises the ansatz that the variance of the steady-state probability distribution of constituent numbers in a population scales like the system size. This ansatz is used to expand the master equation in terms of a small parameter given by the inverse system size.

Specifically, let us write ${\displaystyle X_{i}}$, the copy number of component ${\displaystyle i}$, as a sum of its "deterministic" value (a scaled-up concentration) and a random variable ${\displaystyle \xi _{i}}$, scaled by ${\displaystyle \Omega ^{1/2}}$:

${\displaystyle X_{i}=\Omega \phi _{i}+\Omega ^{1/2}\xi _{i}.}$

The probability distribution of ${\displaystyle \mathbf {X} }$ can then be rewritten in terms of the vector of random variables ${\displaystyle \mathbf {\xi } }$:

${\displaystyle P(\mathbf {X} ,t)=P(\Omega \mathbf {\phi } +\Omega ^{1/2}\mathbf {\xi } ,t)=\Pi (\mathbf {\xi } ,t).}$

Let us consider how to write reaction rates ${\displaystyle f}$ and the step operator ${\displaystyle \mathbb {E} }$ in terms of this new random variable. Taylor expansion of the transition rates gives:

${\displaystyle f_{j}(\mathbf {x} )=f_{j}(\mathbf {\phi } +\Omega ^{-1/2}\mathbf {\xi } )=f_{j}(\mathbf {\phi } )+\Omega ^{-1/2}\sum _{i=1}^{N}{\frac {\partial f_{j}(\mathbf {\phi } )}{\partial \phi _{i}}}\xi _{i}+O(\Omega ^{-1}).}$

The step operator has the effect ${\displaystyle \mathbb {E} f(n)\rightarrow f(n+1)}$ and hence ${\displaystyle \mathbb {E} f(\xi )\rightarrow f(\xi +\Omega ^{-1/2})}$:

${\displaystyle \prod _{i=1}^{N}\mathbb {E} ^{-S_{ij}}\simeq 1-\Omega ^{-1/2}\sum _{i}S_{ij}{\frac {\partial }{\partial \xi _{i}}}+{\frac {\Omega ^{-1}}{2}}\sum _{i}\sum _{k}S_{ij}S_{kj}{\frac {\partial ^{2}}{\partial \xi _{i}\,\partial \xi _{k}}}+O(\Omega ^{-3/2}).}$
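The expansion of the step operators can be checked term by term. As a sketch, let ${\displaystyle g}$ be an arbitrary smooth test function (introduced here only for illustration). Because ${\displaystyle X_{i}=\Omega \phi _{i}+\Omega ^{1/2}\xi _{i}}$, replacing ${\displaystyle X_{i}}$ by ${\displaystyle X_{i}-S_{ij}}$ at fixed ${\displaystyle \phi _{i}}$ shifts ${\displaystyle \xi _{i}}$ by ${\displaystyle -\Omega ^{-1/2}S_{ij}}$, so a Taylor series in the shift gives

${\displaystyle \mathbb {E} ^{-S_{ij}}g(\xi _{i})=g(\xi _{i}-\Omega ^{-1/2}S_{ij})=g(\xi _{i})-\Omega ^{-1/2}S_{ij}{\frac {\partial g}{\partial \xi _{i}}}+{\frac {\Omega ^{-1}}{2}}S_{ij}^{2}{\frac {\partial ^{2}g}{\partial \xi _{i}^{2}}}+O(\Omega ^{-3/2}).}$

Multiplying these single-species operators over ${\displaystyle i}$ and keeping terms up to ${\displaystyle \Omega ^{-1}}$ produces the mixed terms ${\displaystyle S_{ij}S_{kj}}$ with ${\displaystyle i\neq k}$, which together with the diagonal terms form the double sum.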

We are now in a position to recast the master equation.

${\displaystyle {\begin{aligned}&{}\quad {\frac {\partial \Pi (\mathbf {\xi } ,t)}{\partial t}}-\Omega ^{1/2}\sum _{i=1}^{N}{\frac {\partial \phi _{i}}{\partial t}}{\frac {\partial \Pi (\mathbf {\xi } ,t)}{\partial \xi _{i}}}\\&=\Omega \sum _{j=1}^{R}\left(-\Omega ^{-1/2}\sum _{i}S_{ij}{\frac {\partial }{\partial \xi _{i}}}+{\frac {\Omega ^{-1}}{2}}\sum _{i}\sum _{k}S_{ij}S_{kj}{\frac {\partial ^{2}}{\partial \xi _{i}\,\partial \xi _{k}}}+O(\Omega ^{-3/2})\right)\\&{}\qquad \times \left(f_{j}(\mathbf {\phi } )+\Omega ^{-1/2}\sum _{i}{\frac {\partial f_{j}(\mathbf {\phi } )}{\partial \phi _{i}}}\xi _{i}+O(\Omega ^{-1})\right)\Pi (\mathbf {\xi } ,t).\end{aligned}}}$
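The left-hand side of the recast equation follows from the chain rule. The time derivative in the master equation is taken at fixed ${\displaystyle \mathbf {X} }$, and with ${\displaystyle X_{i}}$ held fixed the relation ${\displaystyle X_{i}=\Omega \phi _{i}(t)+\Omega ^{1/2}\xi _{i}}$ forces ${\displaystyle \xi _{i}}$ to vary in time:

${\displaystyle \left.{\frac {d\xi _{i}}{dt}}\right|_{\mathbf {X} }=-\Omega ^{1/2}{\frac {d\phi _{i}}{dt}},\qquad {\frac {dP(\mathbf {X} ,t)}{dt}}={\frac {\partial \Pi }{\partial t}}+\sum _{i=1}^{N}\left.{\frac {d\xi _{i}}{dt}}\right|_{\mathbf {X} }{\frac {\partial \Pi }{\partial \xi _{i}}}={\frac {\partial \Pi }{\partial t}}-\Omega ^{1/2}\sum _{i=1}^{N}{\frac {d\phi _{i}}{dt}}{\frac {\partial \Pi }{\partial \xi _{i}}}.}$

On the right-hand side, the leading 1 in the expansion of the step operators cancels against the −1 in the master equation, which is why the first bracket starts at order ${\displaystyle \Omega ^{-1/2}}$.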

This rather frightening expression makes a bit more sense when we gather terms in different powers of ${\displaystyle \Omega }$. First, terms of order ${\displaystyle \Omega ^{1/2}}$ give

${\displaystyle \sum _{i=1}^{N}{\frac {\partial \phi _{i}}{\partial t}}{\frac {\partial \Pi (\mathbf {\xi } ,t)}{\partial \xi _{i}}}=\sum _{i=1}^{N}\sum _{j=1}^{R}S_{ij}f_{j}(\mathbf {\phi } ){\frac {\partial \Pi (\mathbf {\xi } ,t)}{\partial \xi _{i}}}.}$

These terms cancel, due to the macroscopic reaction equation

${\displaystyle {\frac {\partial \phi _{i}}{\partial t}}=\sum _{j=1}^{R}S_{ij}f_{j}(\mathbf {\phi } ).}$

The terms of order ${\displaystyle \Omega ^{0}}$ are more interesting:

${\displaystyle {\frac {\partial \Pi (\mathbf {\xi } ,t)}{\partial t}}=\sum _{j}\left(\sum _{ik}-S_{ij}{\frac {\partial f_{j}}{\partial \phi _{k}}}{\frac {\partial (\xi _{k}\Pi (\mathbf {\xi } ,t))}{\partial \xi _{i}}}+{\frac {1}{2}}f_{j}\sum _{ik}S_{ij}S_{kj}{\frac {\partial ^{2}\Pi (\mathbf {\xi } ,t)}{\partial \xi _{i}\,\partial \xi _{k}}}\right),}$

which can be written as

${\displaystyle {\frac {\partial \Pi (\mathbf {\xi } ,t)}{\partial t}}=-\sum _{ik}A_{ik}{\frac {\partial (\xi _{k}\Pi )}{\partial \xi _{i}}}+{\frac {1}{2}}\sum _{ik}[\mathbf {BB} ^{T}]_{ik}{\frac {\partial ^{2}\Pi }{\partial \xi _{i}\,\partial \xi _{k}}},}$

where

${\displaystyle A_{ik}=\sum _{j=1}^{R}S_{ij}{\frac {\partial f_{j}}{\partial \phi _{k}}}={\frac {\partial (\mathbf {S} _{i}\cdot \mathbf {f} )}{\partial \phi _{k}}},}$

and

${\displaystyle [\mathbf {BB} ^{T}]_{ik}=\sum _{j=1}^{R}S_{ij}S_{kj}f_{j}(\mathbf {\phi } )=[\mathbf {S} \,{\mbox{diag}}(f(\mathbf {\phi } ))\,\mathbf {S} ^{T}]_{ik}.}$

The time evolution of ${\displaystyle \Pi }$ is then governed by a linear Fokker–Planck equation with coefficient matrices ${\displaystyle \mathbf {A} }$ and ${\displaystyle \mathbf {BB} ^{T}}$. In the large-${\displaystyle \Omega }$ limit, the remaining terms of ${\displaystyle O(\Omega ^{-1/2})}$ may be neglected; this truncation is termed the linear noise approximation. With knowledge of the reaction rates ${\displaystyle \mathbf {f} }$ and the stoichiometric matrix ${\displaystyle \mathbf {S} }$, the moments of ${\displaystyle \Pi }$ can then be calculated.
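For a linear Fokker–Planck equation of this form, the mean and covariance ${\displaystyle \mathbf {C} =\langle \mathbf {\xi } \mathbf {\xi } ^{T}\rangle -\langle \mathbf {\xi } \rangle \langle \mathbf {\xi } \rangle ^{T}}$ obey the closed equations ${\displaystyle \partial _{t}\langle \mathbf {\xi } \rangle =\mathbf {A} \langle \mathbf {\xi } \rangle }$ and ${\displaystyle \partial _{t}\mathbf {C} =\mathbf {AC} +\mathbf {CA} ^{T}+\mathbf {BB} ^{T}}$, so at a stable steady state the covariance solves ${\displaystyle \mathbf {AC} +\mathbf {CA} ^{T}+\mathbf {BB} ^{T}=0}$. The following is a minimal sketch of that calculation, not a reference implementation: the function names, the finite-difference Jacobian, and the birth-death example (production at constant rate `k`, first-order degradation with rate constant `gamma`) are assumptions chosen for illustration.

```python
import numpy as np

def lna_matrices(S, f, phi, eps=1e-6):
    """Return (A, BBT) evaluated at the concentrations phi.

    S   : (N, R) stoichiometric matrix
    f   : callable mapping phi (length N) to the R macroscopic rates
    A   : Jacobian of S @ f(phi), approximated by central finite differences
    BBT : S diag(f(phi)) S^T
    """
    S = np.asarray(S, dtype=float)
    phi = np.asarray(phi, dtype=float)
    N = len(phi)
    J = np.zeros((S.shape[1], N))            # J[j, k] = d f_j / d phi_k
    for k in range(N):
        d = np.zeros(N)
        d[k] = eps
        J[:, k] = (np.asarray(f(phi + d)) - np.asarray(f(phi - d))) / (2 * eps)
    A = S @ J
    BBT = S @ np.diag(np.asarray(f(phi), dtype=float)) @ S.T
    return A, BBT

def stationary_covariance(A, BBT):
    """Solve A C + C A^T + BBT = 0 for C via a Kronecker-product linear system."""
    N = A.shape[0]
    I = np.eye(N)
    K = np.kron(A, I) + np.kron(I, A)        # acts on the row-major flattening of C
    return np.linalg.solve(K, -BBT.reshape(-1)).reshape(N, N)

# Illustrative birth-death example (an assumption, not taken from the article):
#   0 -> X with rate k,   X -> 0 with rate gamma * phi
k, gamma = 2.0, 0.5
S = [[1.0, -1.0]]
f = lambda phi: np.array([k, gamma * phi[0]])
phi_ss = np.array([k / gamma])               # steady state of d(phi)/dt = k - gamma * phi

A, BBT = lna_matrices(S, f, phi_ss)
C = stationary_covariance(A, BBT)
print(A, BBT, C, phi_ss)
```

In this example the variance of ${\displaystyle \xi }$ equals ${\displaystyle \phi }$, so the variance of the copy number ${\displaystyle X}$ is ${\displaystyle \Omega \phi }$, equal to its mean, as expected for a Poisson-distributed birth-death process.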

## Software

The linear noise approximation has become a popular technique for estimating the size of intrinsic noise in terms of coefficients of variation and Fano factors for molecular species in intracellular pathways. The second moments obtained from the linear noise approximation (on which the noise measures are based) are exact only if the pathway is composed of first-order reactions. However, bimolecular reactions such as enzyme-substrate, protein-protein and protein-DNA interactions are ubiquitous elements of all known pathways; for such cases, the linear noise approximation gives estimates that are accurate only in the limit of large reaction volumes. Since this limit is taken at constant concentrations, it follows that the linear noise approximation gives accurate results in the limit of large molecule numbers and becomes less reliable for pathways characterized by many species with low copy numbers of molecules.
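The dependence on reaction volume can be illustrated with a small numerical experiment. The sketch below is an illustration under assumptions chosen here, not a result from the cited literature: it uses a hypothetical bimolecular model (production ${\displaystyle \varnothing \rightarrow X}$ with ${\displaystyle f_{1}=k}$, and pair annihilation ${\displaystyle X+X\rightarrow \varnothing }$ with ${\displaystyle f_{2}=k_{2}x(x-1/\Omega )}$), arbitrary rate constants, and a truncated state space. For this model the linear noise approximation gives ${\displaystyle A=-4k_{2}\phi }$ and ${\displaystyle BB^{T}=k+4k_{2}\phi ^{2}=3k}$ at the steady state ${\displaystyle \phi ={\sqrt {k/(2k_{2})}}}$, hence a Fano factor of 3/4, which is compared with the stationary solution of the master equation for several values of ${\displaystyle \Omega }$.

```python
import numpy as np

def exact_stationary(omega, k=2.0, k2=1.0):
    """Stationary distribution of 0 -> X, X + X -> 0 on a truncated state space."""
    a = omega * k                    # birth propensity, Omega * f_1
    b = k2 / omega                   # annihilation propensity is b * n * (n - 1)
    n_max = max(40, int(4 * omega * np.sqrt(k / (2 * k2))))   # assumed truncation
    Q = np.zeros((n_max + 1, n_max + 1))                      # dP/dt = Q @ P
    for n in range(n_max + 1):
        if n + 1 <= n_max:           # births are switched off at the truncation edge
            Q[n + 1, n] += a
            Q[n, n] -= a
        if n >= 2:
            Q[n - 2, n] += b * n * (n - 1)
            Q[n, n] -= b * n * (n - 1)
    Q[-1, :] = 1.0                   # replace one balance equation by normalisation
    rhs = np.zeros(n_max + 1)
    rhs[-1] = 1.0
    return np.linalg.solve(Q, rhs)

def fano_and_mean(omega, k=2.0, k2=1.0):
    """Fano factor and mean copy number from the master equation."""
    P = exact_stationary(omega, k, k2)
    n = np.arange(len(P))
    mean = n @ P
    var = (n ** 2) @ P - mean ** 2
    return var / mean, mean

phi = np.sqrt(2.0 / (2 * 1.0))       # steady state of d(phi)/dt = k - 2 * k2 * phi**2
lna_fano = 0.75                      # BB^T / (-2 A) / phi = 3/4 for this model

for omega in (1, 5, 50):
    fano, mean = fano_and_mean(omega)
    # columns: Omega, master-equation Fano factor, LNA Fano factor,
    #          master-equation mean concentration, rate-equation concentration
    print(omega, fano, lna_fano, mean / omega, phi)
```

The printed columns allow the master-equation Fano factor and mean concentration to be compared with the linear noise approximation and the rate equation as the volume grows.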

A number of studies have identified cases in which the linear noise approximation is insufficient in biological contexts, by comparing its predictions with those of stochastic simulations.[3][4] This has led to the investigation of higher-order terms of the system size expansion that go beyond the linear approximation. These terms have been used to obtain more accurate estimates of the mean concentrations and of the variances of the concentration fluctuations in intracellular pathways. In particular, the leading-order corrections to the linear noise approximation yield corrections to the conventional rate equations.[5] Higher-order terms have also been used to obtain corrections to the variance and covariance estimates of the linear noise approximation.[6][7] The linear noise approximation and corrections to it can be computed using the open-source software intrinsic Noise Analyzer. The corrections have been shown to be particularly large for allosteric and non-allosteric enzyme-mediated reactions in intracellular compartments.

## References

1. van Kampen, N. G. (2007), "Stochastic Processes in Physics and Chemistry", North-Holland Personal Library.
2. Elf, J. and Ehrenberg, M. (2003), "Fast Evaluation of Fluctuations in Biochemical Networks With the Linear Noise Approximation", Genome Research, 13:2475–2484.
3. Hayot, F. and Jayaprakash, C. (2004), "The linear noise approximation for molecular fluctuations within cells", Physical Biology, 1:205.
4. Ferm, L., Lötstedt, P. and Hellander, A. (2008), "A Hierarchy of Approximations of the Master Equation Scaled by a Size Parameter", Journal of Scientific Computing, 34:127.
5. Grima, R. (2010), "An effective rate equation approach to reaction kinetics in small volumes: Theory and application to biochemical reactions in nonequilibrium steady-state conditions", The Journal of Chemical Physics, 132:035101.
6. Grima, R., Thomas, P. and Straube, A. V. (2011), "How accurate are the nonlinear chemical Fokker-Planck and chemical Langevin equations?", The Journal of Chemical Physics, 135:084103.
7. Grima, R. (2012), "A study of the accuracy of moment-closure approximations for stochastic chemical kinetics", The Journal of Chemical Physics, 136:154105.