Continuous-time Markov chain

In probability theory, a continuous-time Markov chain (CTMC^[1] or continuous-time Markov process^[2]) is a mathematical model which takes values in some finite or countable set and for which the time spent in each state takes non-negative real values and has an exponential distribution. It is a continuous-time stochastic process with the Markov property which means that future behaviour of the model (both remaining time in current state and next state) depends only on the current state of the model and not on historical behaviour. The model is a continuous-time version of the Markov chain model, named because the output from such a process is a sequence (or chain) of states.

Definitions

A continuous-time Markov chain (X_t)_t ≥ 0 is defined by a finite or countable state space S, a transition rate matrix Q with dimensions equal to that of the state space and initial probability distribution defined on the state space. For i ≠ j, the elements q_ij are non-negative and describe the rate the process transitions from state i to state j. The elements q_ii are chosen such that each row of the transition rate matrix sums to zero.

There are three equivalent definitions of the process.^[3]

Infinitesimal definition

Let X_t be the random variable describing the state of the process at time t, and assume that the process is in a state i at time t. Then X_t + h is independent of previous values (X_s : s≤ t) and as h → 0 uniformly in t for all j

\Pr(X(t+h)=j|X(t)=i)=\delta _{ij}+q_{ij}h+o(h)

using little-o notation. The q_ij can be seen as measuring how quickly the transition from i to j happens

Jump chain/holding time definition

Define a discrete-time Markov chain Y_n to describe the nth jump of the process and variables S₁, S₂, S₃, ... to describe holding times in each of the states where the distribution of S_i is given by −q_{Y_iY_i}.

Transition probability definition

For any value n = 0, 1, 2, 3, ... and times indexed up to this value of n: t₀, t₁, t₂, ... and all states recorded at these times i₀, i₁, i₂, i₃, ... it holds that

\Pr(X_{t_{n+1}}=i_{n+1}|X_{t_{0}}=i_{0},X_{t_{1}}=i_{1},\ldots ,X_{t_{n}}=i_{n})=p_{i_{n}i_{n+1}}(t_{n+1}-t_{n})

where p_ij is the solution of the forward equation (a first-order differential equation)

P'(t)=P(t)Q

with initial condition P(0) is the identity matrix.

Properties

Irreducibility

The state space S can be partitioned into communicating classes. i and j are said to communicate (and therefore be in the same communicating class) if it is possible to get to state j from state i, that is if

\operatorname {Pr} _{i}(X_{t}=j{\text{ for some }}t\geq 0)>0.

A CTMC is irreducible if there is a single communicating class.^[3]^[4]

Recurrence and transience

A state i is recurrent if, starting in state i, the probability the process returns unboundedly many times to the state is 1, that is^[5]

\operatorname {Pr} _{i}(\{t\geq 0:X_{t}=i\}{\text{ is unbounded}})=1

and a state i transient if this quantity has probability zero,^[5]

\operatorname {Pr} _{i}(\{t\geq 0:X_{t}=i\}{\text{ is unbounded}})=0.

If the expected return time (the time starting in state i until the next visit to state i) is finite the state is positive recurrent, otherwise it is null recurrent.

Transient behaviour

Write P(t) for the matrix with entries p_ij = P(X_t = j | X₀ = i). Then the matrix P(t) satisfies the forward equation, a first-order differential equation

P'(t)=P(t)Q

where the prime denotes differentiation with respect to t. The solution to this equation is given by a matrix exponential

P(t)=e^{tQ}

In a simple case such as a CTMC on the state space {1,2}. The general Q matrix for such a process is the following 2 × 2 matrix with α,β > 0

Q={\begin{pmatrix}-\alpha &\alpha \\\beta &-\beta \end{pmatrix}}.

The above relation for forward matrix can be solved explicitly in this case to give

P(t)={\begin{pmatrix}{\frac {\beta }{\alpha +\beta }}+{\frac {\alpha }{\alpha +\beta }}e^{-(\alpha +\beta )t}&{\frac {\alpha }{\alpha +\beta }}-{\frac {\alpha }{\alpha +\beta }}e^{-(\alpha +\beta )t}\\{\frac {\beta }{\alpha +\beta }}-{\frac {\beta }{\alpha +\beta }}e^{-(\alpha +\beta )t}&{\frac {\alpha }{\alpha +\beta }}+{\frac {\beta }{\alpha +\beta }}e^{-(\alpha +\beta )t}\end{pmatrix}}

However, direct solutions are complicated to compute for larger matrices. The fact that Q is the generator for a semigroup of matrices

P(t+s)=e^{Q(t+s)}=e^{tQ}e^{sQ}=P(t)P(s)

is used.

Stationary distribution

The stationary distribution for an irreducible recurrent CTMC is the probability distribution to which the process converges for large values of t. Observe that for the two-state process considered earlier with P(t) given by

P(t)={\begin{pmatrix}{\frac {\beta }{\alpha +\beta }}+{\frac {\alpha }{\alpha +\beta }}e^{-(\alpha +\beta )t}&{\frac {\alpha }{\alpha +\beta }}-{\frac {\alpha }{\alpha +\beta }}e^{-(\alpha +\beta )t}\\{\frac {\beta }{\alpha +\beta }}-{\frac {\beta }{\alpha +\beta }}e^{-(\alpha +\beta )t}&{\frac {\alpha }{\alpha +\beta }}+{\frac {\beta }{\alpha +\beta }}e^{-(\alpha +\beta )t}\end{pmatrix}}

as t → ∞ the distribution tends to

P_{\pi }={\begin{pmatrix}{\frac {\beta }{\alpha +\beta }}&{\frac {\alpha }{\alpha +\beta }}\\{\frac {\beta }{\alpha +\beta }}&{\frac {\alpha }{\alpha +\beta }}\end{pmatrix}}

Observe that each row has the same distribution as this does not depend on starting state. The row vector π may be found by solving^[5]

\pi Q=0.

with the additional constraint that

\sum _{i\in S}\pi _{i}=1.

Example

The image to the right describes a continuous-time Markov chain with state-space {Bull market, Bear market, Stagnant market} and transition rate matrix

Q={\begin{pmatrix}-0.025&0.02&0.005\\0.3&-0.5&0.2\\0.02&0.4&-0.42\end{pmatrix}}.

The stationary distribution of this chain can be found by solving π Q = 0 subject to the contraint that elements must sum to 1 to obtain

\pi ={\begin{pmatrix}0.885&0.071&0.044\end{pmatrix}}.

Hitting times

The hitting time is the time, starting in a given set of states until the chain arrives in a given state or set of states. The distribution of such a time period has a phase type distribution. The simplest such distribution is that of a single exponentially distributed transition.

Expected hitting times

For a subset of states A ⊆ S, the vector k^A of hitting times (where element k^A_i represents the expected value, starting in state i that the chain enters one of the states in the set A) is the minimal non-negative solution to^[5]

{\begin{aligned}k_{i}^{A}=0&{\text{ for }}i\in A\\-\sum _{j\in S}q_{ij}k_{j}^{A}=1&{\text{ for }}i\notin A.\end{aligned}}

Time reversal

For a CTMC X_t, the time-reversed process is defined to be $\scriptstyle {\hat {X}}_{t}=X_{T-t}$ . By Kelly's lemma this process has the same stationary distribution as the forward process.

A chain is said to be reversible if the reversed process is the same as the forward process. Kolmogorov's criterion states that the necessary and sufficient condition for a process to be reversible is that the product of transition rates around a closed loop must be the same in both directions.

Embedded Markov chain

One method of finding the stationary probability distribution, π, of an ergodic continuous-time Markov chain, Q, is by first finding its embedded Markov chain (EMC). Strictly speaking, the EMC is a regular discrete-time Markov chain, sometimes referred to as a jump process. Each element of the one-step transition probability matrix of the EMC, S, is denoted by s_ij, and represents the conditional probability of transitioning from state i into state j. These conditional probabilities may be found by

s_{ij}={\begin{cases}{\frac {q_{ij}}{\sum _{k\neq i}q_{ik}}}&{\text{if }}i\neq j\\0&{\text{otherwise}}.\end{cases}}

From this, S may be written as

S=I-\left(\operatorname {diag} (Q)\right)^{-1}Q

where I is the identity matrix and diag(Q) is the diagonal matrix formed by selecting the main diagonal from the matrix Q and setting all other elements to zero.

To find the stationary probability distribution vector, we must next find $\phi$ such that

\phi S=\phi ,\,

with $\phi$ being a row vector, such that all elements in $\phi$ are greater than 0 and $||\phi ||_{1}$ = 1. From this, π may be found as

\pi ={-\phi (\operatorname {diag} (Q))^{-1} \over \left\|\phi (\operatorname {diag} (Q))^{-1}\right\|_{1}}.

Note that S may be periodic, even if Q is not. Once π is found, it must be normalized to a unit vector.

Another discrete-time process that may be derived from a continuous-time Markov chain is a δ-skeleton—the (discrete-time) Markov chain formed by observing X(t) at intervals of δ units of time. The random variables X(0), X(δ), X(2δ), ... give the sequence of states visited by the δ-skeleton.

Applications

Markov chains are used to describe physical processes where a system evolves in constant time. Sometimes, rather than a single systems, they are applied to an ensemble of identical, independent systems, and the probabilities are used to find how many members of the ensemble are in a given state. A master equation treatment is often used to analyse systems that evolve as Markov chains ,^{[citation needed]} with approximations possible for complicated systems .^{[citation needed]}

Chemical reactions

Imagine a large number n of molecules in solution in state A, each of which can undergo a chemical reaction to state B with a certain average rate. Perhaps the molecule is an enzyme, and the states refer to how it is folded. The state of any single enzyme follows a Markov chains, and since the molecules are essentially independent of each other, the number of molecules in state A or B at a time is n times the probability a given molecule is in that state.

Queueing theory

Numerous queueing models use continuous-time Markov chains. For example, an M/M/1 queue is a CTMC on the non-negative integers where upward transitions from i to i + 1 occur at rate λ according to a Poisson process and describe job arrivals, while transitions from i to i – 1 (for i > 1) occur at rate μ (job service times are exponentially distributed) and describe completed services (departures) from the queue.

Extensions

A time dependent (time heterogeneous) CTMC is as above, but with the transition rate matrix a function of time Q(t).

References

^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1007/3-540-48320-9_12, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1007/3-540-48320-9_12 instead.
^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1287/opre.27.3.616, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1287/opre.27.3.616 instead.
^ ^a ^b Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1017/CBO9780511810633.004, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1017/CBO9780511810633.004 instead.
^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1017/CBO9780511810633.003, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1017/CBO9780511810633.003 instead.
^ ^a ^b ^c ^d Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1017/CBO9780511810633.005, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1017/CBO9780511810633.005 instead.

William J. Stewart (1994). Introduction to the Numerical Solution of Markov Chains. Princeton University Press. pp. 17–23. ISBN 0-691-03699-3.

[1] Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1007/3-540-48320-9_12, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1007/3-540-48320-9_12 instead.

[2] Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1287/opre.27.3.616, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1287/opre.27.3.616 instead.

[norris1-3] Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1017/CBO9780511810633.004, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1017/CBO9780511810633.004 instead.

[4] Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1017/CBO9780511810633.003, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1017/CBO9780511810633.003 instead.

[norris2-5] Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1017/CBO9780511810633.005, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1017/CBO9780511810633.005 instead.

[1]

[2]

[3]

[4]

[5]

v t e Queueing theory
Single queueing nodes	D/M/1 queue M/D/1 queue M/D/c queue M/M/1 queue Burke's theorem M/M/c queue M/M/∞ queue M/G/1 queue Pollaczek–Khinchine formula Matrix analytic method M/G/k queue G/M/1 queue G/G/1 queue Kingman's formula Lindley equation Fork–join queue Bulk queue
Arrival processes	Poisson point process Markovian arrival process Rational arrival process
Queueing networks	Jackson network Traffic equations Gordon–Newell theorem Mean value analysis Buzen's algorithm Kelly network G-network BCMP network
Service policies	FIFO LIFO Processor sharing Round-robin Shortest job next Shortest remaining time
Key concepts	Continuous-time Markov chain Kendall's notation Little's law Product-form solution Balance equation Quasireversibility Flow-equivalent server method Arrival theorem Decomposition method Beneš method
Limit theorems	Fluid limit Mean-field theory Heavy traffic approximation Reflected Brownian motion
Extensions	Fluid queue Layered queueing network Polling system Adversarial queueing network Loss network Retrial queue
Information systems	Data buffer Erlang (unit) Erlang distribution Flow control (data) Message queue Network congestion Network scheduler Pipeline (software) Quality of service Scheduling (computing) Teletraffic engineering
Category