Talk:Markov process

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Statistics (Rated Start-class, Mid-importance)
WikiProject icon

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

Start-Class article Start  This article has been rated as Start-Class on the quality scale.
 Mid  This article has been rated as Mid-importance on the importance scale.
WikiProject Russia / Science & education (Rated Start-class, Mid-importance)
WikiProject icon This article is within the scope of WikiProject Russia, a WikiProject dedicated to coverage of Russia on Wikipedia.
To participate: Feel free to edit the article attached to this page, join up at the project page, or contribute to the project discussion.
Start-Class article Start  This article has been rated as Start-Class on the project's quality scale.
 Mid  This article has been rated as Mid-importance on the project's importance scale.
Taskforce icon
This article is supported by the science and education in Russia task force.

Needs Examples[edit]

Of both markovian processes and non markovian process. Please include simple clear examples. Scientifically or mathematically important examples. Examples that test the limits of the definition. —Preceding unsigned comment added by (talk) 08:42, 2 February 2011 (UTC)

In addition to the above poster's comment: Also, the existing example of a non-markovian process (the one with the coins in the purse) is not very good. If X6=0.50, then the only possible way is to have one coin of 0.25 and 5 coins of 0.5 - anything else would not give exactly 0.50 with 6 coins. So in that case the argument that knowing only X6 imparts less information than knowing also the history actually does not hold. — Preceding unsigned comment added by (talk) 07:42, 15 July 2014 (UTC)

Isn't the example constructed so that the "state" is only the sum of the values of the current coins? (I.e. that there are six coins, and not some other number that add up to fifty cents, is not included.) Perhaps there could be some connection to the subsequent remarks about Markov representations of non-Markov processes, where we note that if "state" means "total value of coins drawn so far," then this is a non-Markov process, but if "state" means "vector of counts of coin denominations drawn so far," then this is a Markov process. — Preceding unsigned comment added by (talk) 18:41, 22 December 2014 (UTC)

In the example sections shouldn't P be :

P =     |  0.85    0.10  |
          |  0.15    0.90  | 

instead of

P =     |  0.85    0.15  |
          |  0.10    0.90  |

Since the column vector should add up to be 1. — Preceding unsigned comment added by (talk) 16:40, 3 August 2014 (UTC)

Article needs major overhaul[edit]

I will just point out some inconsistencies and mistakes in this article:

A Markov process is, as translations from Russian state, "a process without after-effect" This sounds as if the word 'Markov' translates to 'without after-effect'. In fact Markov is the name of a Russian mathematician. It should be made clear to which original text the translation refers to.

From the standard definitions of drift and diffusion coefficients this means that those coefficients may depend at most on a single point (x, t) and on no history at all. Some author seems to have mixed the concepts of 'Ito-process' and a 'Markov-process'. There is no 'drift and diffusion coefficient' of a general Markov process. The whole first paragraph is very confusing and seems to be more about Ito-processes than about Markov-processes

The Markov property is stated twice with different definitions. -- (talk) 13:58, 26 March 2008 (UTC)


This article should be merged with the article Markov chain. As far as I can tell, this article doesn't actually say anything that the Markov chain article doesn't say. linas 15:56, 4 September 2005 (UTC)

I agree that the content may be related, but I think having a separate entry for Markov processes is still useful for the layperson. (Anonymous user) 23:37, 22 September 2005 (PST).
?? I don't get the sense that this aricle is somehow "easier to understand" than the other one. linas 16:18, 23 September 2005 (UTC)
Laypersons looking for a Markov process will be redirected to Markov chain when the redirect is in place --Anthony Liekens 14:43, 10 October 2005 (UTC)
I think it should be merged with Markov property not with markov chain. Markov property
the contents of this article discusses the Markov property, but it's title is more related to a Markov chain, not the Markov property --Anthony Liekens 14:43, 10 October 2005 (UTC)
Agree, just putting up a redirect to Markov chain would be sufficient. By the way, there's a lot of work to be done in Markov chain itself! --Anthony Liekens 14:43, 10 October 2005 (UTC)
It makes sense to fold Process into the Property entry, then have a link called Process. But I would leave the Chain entry alone unless someone has a clear idea how to integrate Process, Property, Chain, and perhaps HMM, MDP, and POMDP. --Randy Crawford 20:55, 10 October 2005 (UTC)
At the very least, somewhere there needs to be either an entry or subsection for 'Markov process.' Any textbook on Stochastic Processes usually has an entire chapter devoted to 'Markov processes'. Furthermore, not all Markov processes are Markov chains (e.g. a process that is not stationary or homogeneous in time); redirecting the entry for the 'process' to an entry for one particular class of that process would only confuse and bias the reader. If it is to be redirected, it ought to be to folded into Markov property and not Markov chain. --H. C. Hodges 15:52, 22 October 2005 (PST).
Markov property seems a nicer article than Markov process, but since the Markov property is a property of processes, an article titled Markov process and also covering the Markov property seems best. Then, Markov property could redirect there. Merging this with Markov chain would be a mistake, for the reasons given by H. C. Hodges. Ben Cairns 13:39, 23 October 2005 (UTC).

I have clarified the second paragraph slightly. LS

Introductory sentence[edit]

It would be good to have an introductory sentence that introduces the subject to a layperson. For example, see [1]. Jim 18:51, 24 March 2006 (UTC)

Continuous Time[edit]

For mathematicians, usually Markov process means continuous time, whereas Markov chain means discrete time. Markov processes are often used in chemistry and biology, and their properties are different than Markov chains -- would be nice to have a separate entry. Leboudec 16:24, 15 May 2006 (UTC)

Greetings Leboudec. Are you Jean-Yves Le Boudec, author of "Fluid Approximations of Continuous Time Markov Chains"? I've generally heard continuous time Markov processes on discrete state spaces described as Markov chains. LachlanA 02:21, 30 July 2007 (UTC)

Continuous Value[edit]

Markov chain is only for countable set of states. I thought Markov process might contain continuous valued version. (c.f. Mathworld [2]) Memming 13:15, 24 September 2006 (UTC)

Markov chains, processes, etc[edit]

Are not there too many pages about the same thing? M. chain,, continuous time M.process, etc. I propose the following: 1) merge M. process with M. property; this will include the def. both for discrete and cont. time 2) then there will be a link to M. chain for properties specific for discrete time, and to cont.-time M. proc. - for properties specific for cont. time 3) To the latter (cont. time) one should add geometrical properties such as the Bakry-Emery condition. Convergence to equilibrium should also appear somewhere - maybe, in the first page (M. process?)

What do you think? Sodin 14:38, 1 March 2007 (UTC)

Please excuse me asking yet another question here instead of helping directly with yours, Sodin. Can the term 'Markov Chain' also relate to a continuous time process if it has a discrete state space? If so, the MC page would need to deal with both continuous and discrete time. Trog23 17:10, 3 March 2007 (UTC)
I think a chain is with discrete time and any state space. The division between discrete and continuous state space is a bit obscure: the state space can be any, say, topological space (whereas the time should be either Z or Z_+ or R or R_+). Sodin 01:06, 4 March 2007 (UTC)
OK, thanks. Your definition agrees with the present text on both the 'Markov process' and 'Markov Chain' pages, and with the Oxford Users' Guide to Mathematics (ISBN 0-19-85073-1 section 6.4.2 p 1039) which defines a MC in terms of a discrete time parameter. However Norris ('Markov Chains', Cambridge University Press, ISBN 0-521-63396-6) states that “it is usual to refer to [a MP with a discrete state space] as a Markov Chain”, which is in agreement with Michael Hardy's recent comment on the Continuous-Time MP talk page. So, there seems to be some inconsistency here. Surely a discrete sequence of values would be generated if either the time parameter or the state space (or both) were discrete, and it should be reasonable to describe either case as a 'chain'? Regards Trog23 17:55, 5 March 2007 (UTC)

I agree that merging the content from Markov property into this page would be suitable. LachlanA 02:25, 30 July 2007 (UTC)

Yes, one of us has to get to work and clean up the Markov sections. I suggest one entry called Markov Processes. Under this heading we say that "chain" refers to discrete time (see Meyn & Tweedie 2005, [1]). Most of the chapter heading will deal with discrete time, but one subsection can be entitled continuous time. This will contain a link to a wiki diffusions entry.

Spmeyn (talk) 15:25, 11 December 2007 (UTC)
I did this a while ago, and changed the pages pertaining to these confusions. Instead of merging anything (which I think is not warranted) I followed literature which seems to decree on Markov chains being discrete time, on any state space (although I allowed the article Markov chain to focus on discrete or countable state spaces). I hope all definitions and terminology are now correct and consistent at the articles Markov chain, Markov process, Continuous-time Markov chain, Markov property and Markov model.
This page, Markov process, can serve as a top level page for most general stochastic processes that have the Markov property. I added a more lengthy introduction with links to the other articles (in addition to the former articles, also to Harris chain, Markov model, Continuous-time Markov chain and continuous stochastic process; see also Continuous-time stochastic process), giving this page enough justification for its existence in my opinion. To improve it further we may add sections for each of the different Markov processes, which provide a short introduction to the concepts, emphasising their differences with similar notation and providing "main article" links to their respective articles. --Ecov (talk) 21:19, 20 January 2014 (UTC)

Merge all, separate sections for each[edit]

Why not merge all pages but maintain them in separate sections on the Markov Chain page? —Preceding unsigned comment added by (talk) 18:17, 8 October 2007 (UTC)

  • Suppport merges. way too many... Dicklyon (talk) 02:11, 12 December 2007 (UTC)
  • Support merge. Too many, and 50% of this article is a wandering digression into other processes Professor marginalia (talk) 19:45, 2 January 2008 (UTC)

markov processes that are not dynamical systems?[edit]

In the back of my mind, I tend to draw this (possibly incorrect) mental model of a Markov process as some function on a measure-preserving dynamical system, but where the underlying time evolution is not made explicit (or is not explicitly known). I think my mental model is correct (?) ... but it occurs to me: are there *any* examples of Markov processes that cannot be formulated as functions on a measure-preserving dynamical system? If so, can examples of these be given? linas (talk) 00:15, 10 April 2008 (UTC)

Still an open question, at least for me. When a markov process goes stationary, the measure would be the Haar measure. Today's version of this article now defines a markov process in terms of a filtration instead of a semigroup, but I guess filtrations indexed by natural numbers still form a semi-group, so its still a Haar measure..!? That's what my intuition says; I've never seen this written up in pedagogical detail; any reference to this would be nice. linas (talk) 16:49, 17 July 2012 (UTC)


I found this article fairly messy and had a go at cleaning it up (I was not signed in—sorry); an expert in the area should validate that it's still correct. In particular I removed what appeared to be an extra formal definition; my understanding was that this extra definition was equivalent to the first. If the extra definition is needed for some reason, please explain the difference between the two. Ezrakilty (talk) 21:05, 19 October 2008 (UTC)

Having now caught up on the competition, I support the merger of Markov process and Markov chain. The latter is much more detailed and clear, and seems to encompass all of the former. —Preceding unsigned comment added by Ezrakilty (talkcontribs) 21:09, 19 October 2008 (UTC)

Well, the passage of time has rendered the discussion obsolete. The Markov chain article deals with finite probability spaces, i.e. transition matricies. This article deals with the case where the probability space is a continuum. While conceptually these are almost identical, there is a large shift in terminology, notation; so e.g. the use of sigma-algebras becomes obligatory instead of optional. linas (talk) 16:42, 17 July 2012 (UTC)

Taylor expansions in time?[edit]

Is it true that given a Markovian process with probability density p[X,t], one can take the Taylor expansion in time and, in a limiting procedure, get rid of the 2nd and higher-order terms, like so,

 p[X,t+\tau] = F
 p[X,t] + \dot{p}[X,t]\tau + \frac{1}{2}\ddot{p}[X,t]\tau^2 + \cdots = F
 \dot{p}[X,t]\tau \approx -p[X,t] + F
 \dot{p}[X,t] = \lim_{\tau\rightarrow0} \frac{-p[X,t] + F}{\tau}

whereas, for a non-Markovian process, one cannot necessarily do this? In other words, to the increasingly higher-order time terms reach information about states in the increasingly distant past? (If so, this could be useful information to put in the Taylor series article.) Zeroparallax (talk) 07:05, 15 February 2009 (UTC)

Formal Definition[edit]

I think that the current "formal definition" is a little weak. Given that the "state" is an uncountable set, with the current definition one may construct a processes that is not Markovian, but satisfies the condition. A better definition would be that X=\{X_t:t\geq 0\} is a Markov process if for each B \in \mathcal{B} whenever s\leq t we have that \mathbb{P}(X_t \in B | \sigma(X_k: k\leq s))=\mathbb{P}(X_t \in B | \sigma(X_s))

Does anyone agree? —Preceding unsigned comment added by (talk) 10:16, 31 October 2009 (UTC)


The article mentions a certain case that is a second-order Markov process, but no where does it say what would be a first-order Markov process. If it is going to talk about "order" at all, it should define "order". DMJ001 (talk) 04:55, 11 August 2011 (UTC)

Gibbs measure[edit]

FWIW I put Gibbs measure into the see-also list, but really, a whole section should be devoted to this: all markov processes are gibbs measures! The way to see this is that the markov condition means that the markov process is time-invariant, i.e. that it has a 1-dimensional translational symmetry on the lattice of points in time. Symmetry implies a conserved quantity: what, in physics, would be called "energy". The corresponding lagrange multiplier provides the mechanism for holding this constant, and this just means that any markov process it can be written as a gibbs measure built from a partition function (mathematics).

I don't have any references at the tip of my finger that explain this, and am fain to devote time to this; however, I believe that hunting through the usual 'intro to ergodic theory' or 'intro to dynamical systems' will find one or another treatment of this in detail. Allow me a moment to be bombastic, and say that this really shouldn't be mysterious; this really should be stated as a "basic fact" about Markov processes. linas (talk) 15:31, 17 July 2012 (UTC)

The article Markov property hints at above by talking about the memorylessness of the exponential distribution, without explaining why. A very quick and dirty sketch would be to consider a probability space P and its Cartesian product P^\omega = \lim _{N\to\infty} P^N i.e. its a lattice model. The Markov property says that the only interaction is between nearest neighbors. Thus, the graph aka Markov blanket decomposes into a very simple form, i.e. \sum_i V(P_i) and the only thing that there is that goes between products and sums is the exponential, i.e. the gibbs measure. for more complex markov blankets, one uses subshift of finite type to get the same. The real issue is that there are so many different kinds of notations in use, e.g. measures on cylinder sets for caretesian products, and etc. that its not always clear that these different concepts are all really the same thing. A rosetta stone translating between them, in some markov-article or another would be ideal... linas (talk) 16:03, 17 July 2012 (UTC)

And, as I write this, it all begs a question: what would be the categorical approach be? So, instead of a Cartesian product (category theory) for the probability space, I'm thinking that one would use a closed monoidal category. Why? Well, for starters, it would allow for the tensoring of hilbert spaces, instead of just real-valued probability spaces. Well, why would one want to do that? As Gromov has recently written in his 'search for structure', the Hessian of the entropy is the Fisher information metric, and lo-n-behold, it is the flat metric on the square-root of the probability(!!!), which really points at, donk-be-donk, monoidal categories, and not cartesian categories, as the foundation for probability theory. The mind boggles. Unfortunately, a google search for 'categorical measure theory' makes it clear that this theory is all "under construction", and I don't think the measure theorists have spotted this yet...
At any rate, this also hints that the reason there's a log in the entropy, and the reason that the memory-less-ness of markov measures leads to the exponential distribution are really just two different faces of the same phenomenon: its a translation invariance on a lattice, i.e. its just the law of large numbers/bernoulli measure. One is just covering the space with a bunch of balls/cubes a la kolomogorov. But maybe I'm hallucinating...linas (talk) 16:23, 17 July 2012 (UTC)