# Talk:Continuous-time Markov chain

WikiProject Statistics (Rated C-class, Mid-importance)

This redirect is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

C  This redirect does not require a rating on the quality scale.
Mid  This redirect has been rated as Mid-importance on the importance scale.

WikiProject Mathematics (Rated C-class, Mid-priority)
This redirect is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
 C Class
 Mid Priority
Field: Probability and statistics

## Terminology

Undo the name change, see this page: http://www.encyclopediaofmath.org/index.php/Markov_process — Preceding unsigned comment added by 213.103.216.40 (talk) 06:07, 6 April 2013 (UTC)

Why should the state space be discrete?? Sodin 01:50, 4 March 2007 (UTC)

It shouldn't. I've changed it. I think some people may reserve the word "chain" for processes with discrete state spaces and this article may at some point have been moved from "continuous-time Markov chain". (Check the edit history, maybe?) Michael Hardy 03:25, 4 March 2007 (UTC)
This is incorrect. Markov chains can have a non-discrete state space. However the word "chain" is often reserved for discrete time! A search in google scholar of "continuous time markov chain" yields one third of the result compared to "continuous time markov process". The article should definitely be renamed! The name chain does not make sense for something that moves in continuous time on a contiuous space. To call a poisson-process a chain would be ok but a brownian motion is not a chain of events.

The state space is still assumed to be discrete in the article. A Markov process on a discrete space is generally referred to as a Markov chain. The more general definition should be given here. --129.7.128.30 (talk) 02:01, 5 January 2010 (UTC)
This is something that needs to be sorted out properly. The Markov chain article is presently (except possibly in the lead) only about discrete-time and discrete-space. The present version of this article is only about continuous-time discrete-space. And the present version of Markov process only has definitions suitable for discrete-space; similarly for Markov property. A possibility is to rename the present version of this article to sometime like "continuous time Markov chain" or "Markov chain (continuous time)" and to expand Markov process to properly cover continous-(multivariate)-space. Or Markov process could be a simple overview with pointers to Markov chain , "continuous time Markov chain" (renamed from here) and to an entirely new version of "Continuous-time Markov process". Melcombe (talk) 10:48, 5 January 2010 (UTC)

It seems to me that the terms "jump process" and "embedded Markov chain" relate to the same things; maybe the corresponding paragraphs should be merged? 140.78.107.99 13:48, 25 April 2007 (UTC)

Good catch. When I wrote the EMC section, I hadn't heard that referred to as a "jump process" before, but from a quick search of papers, they look like they're used at least somewhat interchangeably, so I merged the two sections. I'm not familiar with the recent literature or subtleties between these two terms, so if anyone knows of any subtle differences between EMC and jump process, please edit accordingly. I see that jump process refers to a finance term, but it doesn't sound like it's the same as what is discussed in this article.Halcyonhazard 17:01, 26 April 2007 (UTC)

## citations

It would be nice to have citations for this page. 67.233.152.184 (talk) 20:51, 9 March 2008 (UTC)Michal

I dont know definition of conservative Q matrix, but the equation seems to be wrong, because it implies negative probabilties. maybe there is missing 1 and it should look like ${\displaystyle q_{ii}=1-\sum _{j\neq i}q_{ij},}$ instead of ${\displaystyle q_{ii}=-q_{i}=-\sum _{j\neq i}q_{ij}}$? --Stepan Roucka (talk) 15:23, 17 August 2008 (UTC)

The diagonal of the Q matrix is supposed to be negative for CTMCs, and the formula you were questioning is, in fact, correct in the article. See [1] for a few examples. The elements in CTMCs are rates, not probabilities, and the diagonal elements are sort of an inflow to balance the outflow (maybe not the best way to think about it, but it's one analogy). Halcyonhazard (talk) 23:04, 17 August 2008 (UTC)
This line has been changed back to ${\displaystyle q_{ii}=\sum _{j\neq i}q_{ij}.}$. there's a comment in the main text explaining "no minus sign here if 1-qh in conditional prob above, as the q must be positive", which I believe to be correct (see also later where there's ${\displaystyle e^{-q_{ii}r}}$). however this convention isn't a very practical one. It's very useful to have a transition matrix s.t. ${\displaystyle p(t+h)=e^{Qh}p(t)}$, which is only valid if ${\displaystyle q_{ii}=-\sum _{j\neq i}q_{ij}}$; this is the convention I've seen in recent papers I've read on markov processes. Is there a good reason to stick with the convention used here? Flies 1 (talk) 15:22, 7 April 2010 (UTC)
There is no good reason (as discussed below) to stick to the present convention, except the need to have a good source for inline citations for the various results that are quoted. A bald "reference" to a complete book at the end of an article is hardly helpful to anyone hoping to check particular facts ... let's have at least a section number, but better a page(range) for each definition and important result. But rather than just using a single notation, I think there are books that switch from the present defintion for scalar rates to switch the sign when needed for a "matrix". And also recall the need to make things consistent for other articles that use transition rate and transition matrix. 193.62.153.194 (talk) 16:11, 7 April 2010 (UTC)

The one inline citation given uses the sign convention as in the start of this article and has formulae agreeing with some of those here (at present), but a lot of the others don't appear. Some edits and reversion indicate that there may be more than one convention around, and it may be that some formulae need to use the other convention or a different notation. Is someone able to give a proper citation for a source (or sources) that definitely do use a single convention for defining the required rates and matrices? I did hide some text that looked definitely wrong with the convention stated. Melcombe (talk) 10:45, 20 January 2010 (UTC)

When I initially contributed to this article, I used the Stewart reference (currently at the bottom). If I recall correctly, he consistently uses a single convention for CTMCs throughout. Note that the convention for DTMCs and CTMCs is different by definition (at least as far as I am aware). Halcyonhazard (talk) 05:24, 21 January 2010 (UTC)
There are two problems: (i) to get a consistent formulation here for "Continuous time Markov process", and (ii) to provide definition, used in other articles and linked through to here, for "transition rate" and "transition rate matrix" where these would need to be of the "usual" form. It would be good to give several different citation that agree on a defintion for these things (or possibly introduce an auxiliary matrix with negative values on the diagonal). Melcombe (talk) 10:00, 21 January 2010 (UTC)
Somewhat in line with the original concern in this thread I have a concern about the adding of all transition probabilities to give one. I do not have the original Parzen reference cited here, but it occurs pretty obvious to me that
${\displaystyle \Pr(X(t+h)=j|X(t)=i)=q_{ij}h+o(h)}$
Should have an additional Kronecker delta:
${\displaystyle \Pr(X(t+h)=j|X(t)=i)=\delta _{ij}+q_{ij}h+o(h),\,}$
such that
${\displaystyle \Pr(X(t+h)=i|X(t)=i)=1-\sum _{i\neq j}q_{ij}h+o(h),\,}$
In order to have the probablities add up to 1
${\displaystyle \sum _{j}\Pr(X(t+h)=j|X(t)=i)=1}$
Is that not correct, or am I just very tired and in an unfamiliar domain?
I also personally find it notationally confusing to use h as a time step. Would ${\displaystyle \Delta t}$ or a differential ${\displaystyle dt}$ not be easier to understand, or is that just because I am a physicist? --Slaunger (talk) 22:14, 30 June 2010 (UTC)
Maybe someone was tacitly assuming i ≠ j. Such assumptions should not be tacit. Michael Hardy (talk) 04:49, 1 July 2010 (UTC)
Probably yes. I have now explicitly added ${\displaystyle i\neq j}$ after the formula to make it explicit. --Slaunger (talk) 12:00, 1 July 2010 (UTC)
I'm not exactly in OR either, but my research touches on it on occasion. I don't tend to see much use of the ${\displaystyle \Delta t}$ notation, but rather a separate variable instead. The only notable exceptions that springs to mind are econophysics and computational finance (also tangentially related to my work) -- I've seen both notations used there, which may be due to the number of physicists in the field. I lean toward keeping the notation as-is because using ${\displaystyle \Delta t}$ seems like too much abuse of notation for a math article, especially because ${\displaystyle t+\Delta t}$ in this case represents two different variables, not a difference equation. Also, to the best of my knowledge, Michael Hardy was correct about the missing assumption. Halcyonhazard (talk) 04:12, 2 July 2010 (UTC)
I guess my notational preference for ${\displaystyle \Delta t}$ or dt is just because I am a physicist then :-). As I understand the theory, it does not have to be continuous with time. It could be continuous with some other variable. On the other hand the name of the article is Continuous-time Markov process, and it is explicitly mentioned that h is a small step in time. A little later, when writing down the differential equation, we also meet ${\displaystyle \partial t}$. The latter would actually come out very naturally if one considered state probability vector and its evolvement over a differental time dt
${\displaystyle \mathbf {p} (t+dt)=(\mathbf {I} +\mathbf {Q} dt)\mathbf {p} (t)}$
as then
${\displaystyle d\mathbf {p} =\mathbf {p} (t+dt)-\mathbf {p} (t)=\mathbf {Q} \cdot \mathbf {p} (t)dt}$
and by dividing this difference equation with dt you get the matrix differential equation
${\displaystyle {\frac {\partial \mathbf {p} }{\partial t}}=\mathbf {Q} \cdot \mathbf {p} }$
which seems like an instructive path (for a rusty physicist like me at least) to understand the origin of this nice result. As I said, I am rusty, and I think I may have swapped columns and rows in the state rate matrix, in which case p should be a row vector instead and multiplied from the left, but with the same result.
This is triggered by a question I raised at the Mathematics Reference Desk, where I did not understand the article well enough to realize that in finite time the state-transition probability matrix ${\displaystyle {\boldsymbol {\Pi }}(t)}$ for the homogeneous case with lag t will be just the matrix exponential ${\displaystyle {\boldsymbol {\Pi }}(t)=\exp(\mathbf {Q} t)}$. I was wondering if it would not be relevant to mention this for the important homogeneous case and perhaps also mention how one could do that matrix exponential as function of time? (This is something I am still trying to understand, but isn't the best approach to diagonalize Q (is it always diagonalizable?) as then ${\displaystyle \mathbf {Q} t=\mathbf {U} (\mathbf {D} t)\mathbf {U} ^{-1}}$ and then ${\displaystyle {\boldsymbol {\Pi }}(t)=\mathbf {U} \exp(\mathbf {D} t)\mathbf {U} ^{-1}}$, which mean you need to do one diagonalization, and then you can easily calculate the transition state probability matrix for arbitrary times t as a simple matrix product?) --Slaunger (talk) 07:31, 2 July 2010 (UTC) Q is not diagonalizable. --Slaunger (talk) 10:08, 2 July 2010 (UTC)
Seems relevant to me. I'd vote for including it. Halcyonhazard (talk) 16:47, 2 July 2010 (UTC)

## Strange example: {Bull Market, Bear Market, Recession}

The example given in the article, describing a three-state space consisting of {Bull Market, Bear Market, Recession}, seems rather strange. In their ordinary meaning, the "Recession" state is not mutually distinct from the other two. It is certainly possible to have a Recession during a Bull Market or Bear Market. It would be better to define the three-state space as {Bull Market, Bear Market, Stagnant Market} or {Bull Market, Bear Market, Mixed Market}. —BarrelProof (talk) 20:03, 30 May 2013 (UTC)

I would make that change myself, but it requires re-drawing the figure, and I don't have an easy way to do that. Incidentally, the figure also violates the MOS:NUM guideline that says "Numbers between −1 and +1 require a leading zero (0.02, not .02)". —BarrelProof (talk) 20:12, 30 May 2013 (UTC)
Thanks for these suggestions, I've now updated the image, but seem to be suffering some awkward font issues. I'm unsure why at the moment. Previously I haven't had problems. Gareth Jones (talk) 21:53, 30 May 2013 (UTC)
It looks good to me! Thanks. —BarrelProof (talk) 22:29, 30 May 2013 (UTC)
I just noticed that there's a similar example for the discrete-time case in the Markov chain article. —BarrelProof (talk) 22:33, 30 May 2013 (UTC)
Well spotted, I've updated that too. Gareth Jones (talk) 23:10, 30 May 2013 (UTC)

## Discrepancy correction

In the section about Embedded Markov chains, there seems to be a discrepancy between the definition given in terms of ${\displaystyle s_{ij}}$ and ${\displaystyle S=-D_{Q}^{-1}Q}$. In the first definition, the diagonal values of S are given as zero, but the second definition will have them as 1. I believe the first definition is correct, so we want instead that ${\displaystyle S=-D_{Q}^{-1}Q-I}$, which also makes more sense in the following lines when we try to find the kernel of ${\displaystyle (S-I)}$ for the steady states of the discrete time Markov process. If S is the transition probability matrix of the EMC, we should be trying to find the stationary states of ${\displaystyle S}$, not ${\displaystyle S-I}$. I have gone ahead and made this change.76.120.32.59 (talk) 21:02, 3 June 2013 (UTC)Christopher

Note: I have not checked to see if this change affects the definition of \pi on the following line. This should be verified.76.120.32.59 (talk) 21:04, 3 June 2013 (UTC)Christopher

You're right about getting zeroes being correct, but need ${\displaystyle S=I-\left(\operatorname {diag} (Q)\right)^{-1}Q}$ to get them. I'll change this now. Gareth Jones (talk) 11:59, 4 June 2013 (UTC)

## History

Can anyone shed light on the history of CTMCs? Richard Weber's Markov Chain course notes contain a section titled "Historical notes" (p.52) where a quote from Youschkevitch suggests Markov's own work was limited to the discrete case ("For him the only real examples of the chains were literary texts, where the two states denoted the vowels and consonants"), but I haven't made any further progress. Gareth Jones (talk) 11:06, 10 June 2013 (UTC)

## Formula for P(t) seems incorrect

It is written that

${\displaystyle P(t)=e^{tQ}}$

has elements of the form

${\displaystyle p_{ij}(t)=\delta _{ij}+\sum _{k=1}^{\infty }{\frac {t^{k}q_{ij}^{k}}{k!}}}$

which equals

${\displaystyle p_{ij}(t)=\delta _{ij}+e^{tq_{ij}}-1.}$

By definition of matrix exponential, however:

${\displaystyle P(t)=\sum _{k=0}^{\infty }{\frac {t^{k}Q^{k}}{k!}}.}$

And there is no simple elementwise expression, which is the source of complexity of calculating these. So it seems wrong... bungalo (talk) 14:10, 21 July 2013 (UTC)

Yep, you are right, it is incorrect. I have taken it out of the article. — Preceding unsigned comment added by 75.102.81.97 (talk) 19:02, 28 December 2013 (UTC)