# Talk:Stochastic matrix

WikiProject Statistics (Rated Start-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

Start  This article has been rated as Start-Class on the quality scale.
Mid  This article has been rated as Mid-importance on the importance scale.
WikiProject Mathematics (Rated Start-class, Mid-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
 Start Class
 Mid Importance
Field: Probability and statistics

Well, is it the rows or the columns that have to individually sum to 1? See also these edits. Melchoir 08:14, 28 December 2005 (UTC)

It's the columns. See Mathworld. Chris 21:21, 31 December 2005 (UTC)

Not true. A stochastic matrix can either be row stochastic (where the rows each sum to 1) or column stochastic (where the columns each sum to 1). With a row stochastic matrix, the probability vectors are row vectors, you left-multiply the matrix by the vector. With a column stochastic matrix, the vectors are column vectors, and you right-multiply the matrix by the vector. For example, for a Markov chain described by stochastic matrix $A$ and probability vector $\mathbf{x}_k$ at time $k$, you would write
$\mathbf{x}_{k+1} = A\mathbf{x}_k$
if the matrix were column stochastic and $\mathbf{x}_k$ were a column vector, and
$\mathbf{x}_{k+1} = \mathbf{x}_k A$
if the matrix were row stochastic and $\mathbf{x}_k$ were a row vector. Mateoee 18:41, 17 November 2006 (UTC)

## Regular stochastic matrices

I suggest that the sentence (definition) A stochastic matrix P is regular ...' is rewritten in a form that avoids any confusion between the regularity of stochastic matrices used in this article and the regularity (= invertibility) of matrices in general.

I removed the link to the Invertible Matrix article, which hopefully helps with the ambiguity. I'm not an expert on this at this point, but it's certainly clear that the definition of regular stochastic matricies used here has nothing to do with invertibility. It might be useful to use another term (I've seen the term "ergodic", for example, used to describe such stochastic matricies) to avoid any other confusion, but I'm not yet sufficiently up to speed on this to feel comfortable making the edit. Mateoee 17:50, 20 November 2006 (UTC)

## Right vs. left convention

It is a convention in English language mathematic literature to use the right or row sum of 1 version.

However, then in the sections below, I think right (row sum) matrices are used in the section "Definition and properties", and then left (column sum) matrices in the section "Example computation: the cat and the mouse". The "Definition and properties" doesn't even say that it is using right stochastic matrices, it talks of "**the** stochastic matrix". So, I have two suggestions. First, if the intro says that right stochastic matrices are the default convention, use right stochastic matrices everywhere on this page. Second, state explicitly that this page is using that convention. Bayle Shanks 21:56, 14 May 2007 (UTC)

## Multiple absorbing states

What about matrices with more that one absorbing state? For instance a game with two possible final outcomes, like if the mouse could win if they both land in the middle bin. It would be interesting to see how to solve for the probability of each outcome. Anybody interested in doing something like that? --W0lfie (talk) 21:28, 13 December 2007 (UTC)

Multiple sink states of a matrix M can be solved easily by creating a sub square matrix N that is matrix M with all the sink rows and columns removed, then for a particular steady state $S_j$ just solve $S_j = -(N-I)^{-1}M_{j}$, where $M_j$ is the jth column vector of M with all the sink rows removed.

For example the matrix

 1.000   0.000   0.000   0.000   0.000   0.000
0.070   0.166   0.147   0.266   0.214   0.138
0.143   0.175   0.085   0.185   0.201   0.211
0.000   0.000   0.000   1.000   0.000   0.000
0.201   0.247   0.215   0.188   0.144   0.005
0.000   0.000   0.000   0.000   0.000   1.000


has sinks on rows 0,3,5, and the steady state of:

 1.000   0.000   0.000   0.000   0.000   0.000
0.228   0.000   0.000   0.510   0.000   0.262
0.282   0.000   0.000   0.403   0.000   0.316
0.000   0.000   0.000   1.000   0.000   0.000
0.371   0.000   0.000   0.468   0.000   0.160
0.000   0.000   0.000   0.000   0.000   1.000


the sub matrix N with rows and columns 0,3,5 removed is

 0.166   0.147   0.214
0.175   0.085   0.201
0.247   0.215   0.144


and $M_3$ is column 3 of M with rows 0,3,and 5 removed

 0.266
0.185
0.188


and the calculation of the steady state of sink state $S_3 = -(N-I)^{-1}M_{3}$ is:

 0.510
0.403
0.468
`

Obviously this can be extended to the total steady state vector by adding zeros and ones at the sink states 0->0 3->1 5->0 to get the above column 3 from the steady state matrix. I use zero indexing by the way. It can probably be proven that N-I is invertible because M is stochastic, M-I is singular, so the sub matrix must not be singular. Antares5245 (talk) 12:31, 6 April 2010 (UTC)

## Confusing tag

I have put the confusing tag, in regards to part of the article. I have asked questions regarding that in the Mathematics reference desk. --Obsolete.fax (talk) 05:04, 25 January 2008 (UTC)

What specifically is confusing? I agree that the example could be a little clearer, especially the part about finding the expected time. That might not even need to be in the article.
I undid your revision because it had two negative consequences. By swapping the 2nd and 3rd states, you must swap all of the probabilities in the 2nd and 3rd columns, not just the 2nd and 3rd rows. Otherwise you end up with two absorbing states, which is incorrect in this example. Also, the sub-matrix in the next part would need to change as well to be consistent with the original matrix.--W0lfie (talk) 23:06, 29 January 2008 (UTC)
One more thing. Surely there is a simpler example besides the cat and mouse game that we could use. That might alleviate some of the confusion. But what? Are there any three state games that are cross-cultural and easy to explain? --W0lfie (talk) 23:15, 29 January 2008 (UTC)

## Matrix

I am trying to understand Stochastic matrix.

The article has an example of the cat and mouse:

1. You have a timer and a row of five adjacent boxes.

2. The cat is in the first box.

3. The mouse is in the fifth box.

5. If the cat is in the second box and the mouse in the fourth one, the probability is one fourth that the cat will be in the first box and the mouse in the fifth after the timer advances.

So in the above example the article generates some jargon:

$P = \begin{bmatrix} 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 & 0 \\ 1/4 & 1/4 & 0 & 1/4 & 1/4 \\ 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$

• 2. What does above mean in simple mathematics? --Obsolete.fax (talk) 04:54, 25 January 2008 (UTC)

• 4. If State 1: cat in the first box, mouse in the third box: (1, 3); then why is there 1/2 represented on the 3rd and 5th segment of the first line in the above? --Obsolete.fax (talk) 04:58, 25 January 2008 (UTC)

• 5. Is State 1 represented on the first line in the above? --Obsolete.fax (talk) 04:59, 25 January 2008 (UTC)
• Ok so P is a matrix that shows, for each pair of states the before and after. Would the word "matrix" mean the visual mathematic diagram? And thus "P" stands for the "matrix"? --Obsolete.fax (talk) 07:31, 26 January 2008 (UTC)
Yes, but it is more than just a "diagram". In mathematics, a matrix is a rectangular array of numbers or other entries. Before you can understand stochastic matrix, you need to understand how a matrix is defined, and how two matrices can be added or multiplied together. All this is explained in our article matrix (mathematics). Gandalf61 (talk) 09:56, 26 January 2008 (UTC)
A matrix is a kind of mathematical object, just like number, vector, etcetera. (There are also mathematical objects called diagrams, but they do not include matrices.) The presentation of a matrix as a rectangular table is a traditional convenient way of representing them on paper, but it is just a representation of the matrix and not the mathematical object itself, just like the sequence of digits "144" represents a number but is not that number. In the article Stochastic matrix the variable P stands indeed for the matrix displayed as the right-hand side in the definition P = .... --Lambiam 10:12, 26 January 2008 (UTC)
$P = \begin{bmatrix} 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 & 0 \\ 1/4 & 1/4 & 0 & 1/4 & 1/4 \\ 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$
• State 1: cat in the first box, mouse in the third box: (1, 3)
• State 2: cat in the first box, mouse in the fifth box: (1, 5)
• State 3: cat in the second box, mouse in the fourth box: (2, 4)
• State 4: cat in the third box, mouse in the fifth box: (3, 5)
• State 5: the cat ate the mouse and the game ended: F.

The before state with the cat in the second box and the mouse in the fourth box is State 3. The after state with the cat in the first box and the mouse in the fifth box is State 2. The probability of transitioning, given that the system is in State 3, to State 2, is the entry found in row 3, column 2, of P, which is 1/4. From State 3 the system can go into each of States 1, 2, 4 and 5, and these four possible transitions all have equal probability. From State 1 the system can only transition to States 3 and 5, because the cat must move to the second box; depending on whether the mouse goes right or left it survives (State 3) or is eaten (State 5). --Lambiam 09:16, 25 January 2008 (UTC)

Can you elaborate, please? I don't understand where State 1 is in your explanation? --Obsolete.fax (talk) 13:07, 26 January 2008 (UTC)

Okay, let's go over this again. In State 1, the cat is in box 1 and the mouse is in box 3. From State 1, two alternatives can happen. The cat can go to box 2 and the mouse can go to box 4 - this takes us to State 3. Or the cat can go to box 2 and the mouse can also go to box 2 - this takes us to State 5. These two alternatives have equal probability. Is that clear ? Do you see why the entries in the first row of the matrix P are 0, 0, 1/2, 0, 1/2 ? Gandalf61 (talk) 18:13, 27 January 2008 (UTC)
I followed you so far Gandalf61. Please could you explain the following:
Is it true that entries in the first row, represent what will happen after the corresponding State, in this case, State 1?

Is it true that entries in the first row, represent what will happen after the corresponding State, in this case, State 1?
Is it true that State 2 and State 4 which is mentioned earlier cannot happen after State 1?

Yes, the entries in the first row of the matrix represent what will happen is the next time step immediately after the system in in State 1. So immediately after State 1, the system can move to State 3, with probability 1/2, or to State 5, with probability 1/2.

Yes, the system cannot move immediately from State 1 to State 2 or State 4 - to do this, the mouse would have to move from box 3 to box 5, which it cannot do in one time step. Of course, the system could go from State 1 to, say, State 2 in several time steps - it could, for example, go from State 1 to State 3, and then from State 3 to State 2. But the matrix P represents the state changes that the system can make in just one time step, so when we construct P we can ignore state changes that require more than one time step. Gandalf61 (talk) 10:17, 28 January 2008 (UTC)

If the entries in the first row of the matrix represent what will happen is the next time step immediately after the system in in State 1. So immediately after State 1, the system can move to State 3, with probability 1/2, or to State 5, with probability 1/2. What do the entries in the second row represent? --Obsolete.fax (talk) 12:09, 28 January 2008 (UTC)
Out of curiosity, why shouldn't the article represent State 3 as State 2, (and State 2 as State 3), as State 3 occurs before State 2? --Obsolete.fax (talk) 12:09, 28 January 2008 (UTC)
The entries in the second row of the matrix represent what can happen in the next time step immediately after the system is in State 2. From State 2 the system can only go to State 3 - the cat can only move to box 2 and the mouse can only move to box 4. So after State 2 the system goes to State 3 with probability 1 - so the only non-zero entry in row 2 of the matrix is an entry of 1 in column 3.
There is nothing special about the way the states are numbered, as long as a consistent numbering convention is used throughout. States 2 and 3 could indeed be interchanged - in the probaility matrix, this would be equivalent to interchanging rows 2 and 3 and also interchanging columns 2 and 3. Gandalf61 (talk) 17:14, 28 January 2008 (UTC)
I interchanged the State 2 with State 3 in the article and vice versa. Why the entries in the first row of the matrix P are 0, 0, 1/2, 0, 1/2? Shouldn't it be 0, 1/2, 0, 1/2, 0? Per the assumption that the probabilities in the First Row represent what will happen after State 1, and the columns represent where probably the mouse will go? Thus the cat in the second box, mouse in the fourth box? --Obsolete.fax (talk) 16:08, 29 January 2008 (UTC)
Really not a good idea to start changing the labels on the states. Firstly, you will get confused between the two different sets of state definitions. Secondly, you only interchanged rows 2 and 3 in the matrix - to properly relabel the states you should also have interchanged columns 2 and 3. Remember that the state numbers are linked to the order of the columns in the matrix as well as the order of its rows. Anyway, someone has reverted your changes to the article. I strongly suggest you leave the article alone - if you want to discuss other examples, let's do that here on your talk page.
Going back to the original state definitions, the first row of the matrix is 0, 0, 1/2, 0, 1/2 because the only states you can reach in one time step starting from State 1 are State 3 or State 5. You can't move to State 1, State 2 or State 4 in one time step from State 1, so the entries in columns 1, 2 and 4 in row 1 are 0. Do you understand this ? If you don't follow this, read my post above that starts "Okay, let's go over this again ...". Gandalf61 (talk) 07:26, 30 January 2008 (UTC)

Sorry about messing with the article. I understand now that the row in the 1st column represents what will happen after state 1 and the probability in the columns represent which state the matrix will go into. --Obsolete.fax (talk) 08:31, 30 January 2008 (UTC)
Please don't worry about changing the article. You made your changes in good faith, so there is nothing to be sorry about. Remember that one of the first rules of Wikipedia is to be BOLD! Anyway, your points have got me thinking about some clarifications that might be added to the article. I'll try to take a crack at it sometime in the next few weeks. --W0lfie (talk) 23:36, 7 February 2008 (UTC)
Getting back to the original question you had. I think the source of all this confusion is that you expected the state in the first row to represent the initial state. Am I right? It is stated in the article: the initial state is actually "State 2" as represented in the matrix. That means, it is the second row.
So at first you are in the second row. Since it's only got a 1 in the third column, you know that at the next time step you will be in the third row. Since the third row has a 1/4 in columns 1-2-4-5, you have a 1/4 chance of being in any of those states in the next time step. Does that help? --W0lfie (talk) 23:36, 7 February 2008 (UTC)

The row in the 1st column represents what will happen after state 1 and the probability in the columns represent which state the matrix will go into. There is 50% chance it will go into State 3, and 50% chance it will go into State 5. State 1 = (1, 3) ==> State 3 = (2, 4); State 5: (2, 2). I just went through the whole matrix and understood it :).--Obsolete.fax (talk) 03:13, 11 February 2008 (UTC)

The cat and the mouse is a terrible example. Not only does a novice reader need to keep track of two entities (cat and mouse) and 5 boxes and a 5 by 5 matrix. That's so confusing in addition to figuring out what the hell is going on. 122.107.129.141 (talk) 01:33, 16 February 2008 (UTC)

## Confusing

The Stochastic matrix article states in the example:

The initial state of the system at time zero is state 2, which we represent by the Probability vector v:

$\mathbf{v} = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \end{bmatrix}.$

Is there any point to the above other than to say each row must add up to 1? --Obsolete.fax (talk) 08:51, 30 January 2008 (UTC)

I don't see why you think the initial probability vector is connected with the property that the entries in each row of the stochastic matrix add up to 1. The initial probability vector is simply a statement of the initial state of the system. If the system started in State 1 then the initial probability vector would be :$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$ --Gandalf61 (talk) 14:53, 30 January 2008 (UTC)

Why would the initial state of the system at time zero be State 2 and not State 1 the initial state?

It then goes on to state:

The probability distribution of the states at time k is then given by

$\mathbf{p}_k = \mathbf{v}P^k \, .$

How does the article arrive at this formula? --Obsolete.fax (talk) 08:51, 30 January 2008 (UTC)

The formula for the probability distribution after k time steps is derived by induction. At the start, after 0 time steps, the probability distribution is
$\mathbf{p}_0 = \mathbf{v}$
... this is just the definition of $\mathbf{v}$. After 1 time step the probability distribution is

Could you please explain: The formula for the probability distribution after k time steps is derived by mathematical induction. I don't understand the meaning from the articles I wikilinked. --Obsolete.fax (talk) 20:25, 7 February 2008 (UTC)

And since the v which is 1 could also be represented in the formula as:
$\mathbf{p}_k = \mathbf1P^k \, .$

which can be represented by:

$\mathbf{p}_k = P^k .$

Are my above deductions assumptions correct? --Obsolete.fax (talk) 08:53, 30 January 2008 (UTC)

The left hand side of this equation is a row vector and the right hand side is a square matrix, so this equation cannot be correct.
$\mathbf{p}_1 = \mathbf{v}P$
which follows from the definition of P. To move the system forward by one more time step we multiply $\mathbf{p}_1$ by P again, so we get
$\mathbf{p}_2 = \mathbf{p}_1P = \mathbf{v}P^2 \, .$
Continuing like this, after k time steps we have
$\mathbf{p}_k = \mathbf{p}_{k-1}P = \mathbf{v}P^k \, .$
Notice that $\mathbf{v}$ is a row vector - it is not just the number 1. So it is incorrect to say
$\mathbf{p}_k = \mathbf1P^k = P^k\, .$

--Gandalf61 (talk) 14:53, 30 January 2008 (UTC)

I'd have to agree with Obsolete's point that it's not entirely clear where the solution for k comes from unless you have a lot of background in this sort of thing. But then again, to go into this long a proof in the article would introduce a lot of complexity that we might rather avoid. --W0lfie (talk) 23:36, 7 February 2008 (UTC)

## Stochastic matrix Question 1

The Stochastic matrix article states in the example:

Why would the initial state of the system at time zero be State 2 and not State 1 the initial state? --Obsolete.fax (talk) 03:41, 11 February 2008 (UTC)

The initial state as given in the article: "Suppose you have a timer and a row of five adjacent boxes, with a cat in the first box and a mouse in the fifth one at time zero."
That is State 2 as the article defines it: "State 2: cat in the first box, mouse in the fifth box: (1, 5)."
--Lambiam 05:14, 11 February 2008 (UTC)
Yes, I follow you. Thanks for clarifying that.--Obsolete.fax (talk) 04:04, 12 February 2008 (UTC)

The initial state of the system at time zero is state 2, which we represent by the Probability vector v:

$\mathbf{v} = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \end{bmatrix}.$

Is there any point to the above other than to say each row must add up to 1? --Obsolete.fax (talk) 08:51, 30 January 2008 (UTC)

I don't see why you think the initial probability vector is connected with the property that the entries in each row of the stochastic matrix add up to 1. The initial probability vector is simply a statement of the initial state of the system. If the system started in State 1 then the initial probability vector would be :$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$ --Gandalf61 (talk) 14:53, 30 January 2008 (UTC)

So there is no correlation to it adding up to 1? --Obsolete.fax (talk) 03:55, 12 February 2008 (UTC)

What do you mean by "correlation"? All probability vectors must add up to one. If they represent a pure state, only one entry is non-zero, namely the one corresponding to that state.  --Lambiam 07:38, 12 February 2008 (UTC)
So probability vector adds up to 1. I am getting confused; Gandalf mentions that he doesn't see why I think the initial probability vector is connected with the property that the entries in each row of the stochastic matrix add up to 1! Am I missing something here? --Obsolete.fax (talk) 11:54, 12 February 2008 (UTC)
There may be a misconnect over the meaning of the word "connect". There are many vectors that add up to 1. Only one of those represents the system being in the initial state. Your earlier question is a bit as if someone, after having been told that Funafuti is the capital of Tuvalu, asks if there is any point to that statement other than that Funafuti is a settlement. Well, yes, there is. The property of being a settlement is only extremely tenuously "connected" to the property of being the capital of Tuvalu: there are hundreds of thousands of settlements, but only one of those is the capital of Tuvalu.  --Lambiam 09:18, 16 February 2008 (UTC)

## Stochastic matrix Question 2

The Stochastic matrix article states in the example:

It then goes on to state:

The probability distribution of the states at time k is then given by

$\mathbf{p}_k = \mathbf{v}P^k \, .$

How does the article arrive at this formula? --Obsolete.fax (talk) 08:51, 30 January 2008 (UTC)

The formula for the probability distribution after k time steps is derived by induction. At the start, after 0 time steps, the probability distribution is
$\mathbf{p}_0 = \mathbf{v}$
... this is just the definition of $\mathbf{v}$. After 1 time step the probability distribution is

Could you please explain: The formula for the probability distribution after k time steps is derived by mathematical induction. I don't understand the meaning from the articles I wikilinked. --Obsolete.fax (talk) 20:25, 7 February 2008 (UTC)

I'd have to agree with Obsolete's point that it's not entirely clear where the solution for k comes from unless you have a lot of background in this sort of thing. But then again, to go into this long a proof in the article would introduce a lot of complexity that we might rather avoid. --W0lfie (talk) 23:36, 7 February 2008 (UTC)
The probability distribution of the states at time 0 is given by
$\mathbf{p}_0 = \mathbf{v}\,.$
The probability distribution of the states at time 1 is given by
$\mathbf{p}_1 = \mathbf{p}_0 P\,.$
The probability distribution of the states at time 2 is given by
$\mathbf{p}_2 = \mathbf{p}_1 P\,.$
The probability distribution of the states at time 3 is given by
$\mathbf{p}_3 = \mathbf{p}_2 P\,.$
And so on. So, for example,
$\mathbf{p}_3 = \mathbf{p}_2 P = \mathbf{p}_1 P^2 = \mathbf{p}_0 P^3 = \mathbf{v} P^3\,.$
This generalizes to other values than 3.  --Lambiam 05:21, 11 February 2008 (UTC)
I do apologize if I am not following you or Gandalf. But where are you deriving the formulas from? And what is the purpose of the formulas which is described? --Obsolete.fax (talk) 04:00, 12 February 2008 (UTC)
What are the things you do understand that we can build on? Do you understand the concept of probability vector and its relation to discrete probability distributions? Are you familiar with matrix multiplication?  --Lambiam 08:48, 12 February 2008 (UTC)
I am sorry if I am not able to grasp mathematics as easily as you, but I thank you for your consideration. The article Probability vector mentions the following: In mathematics and statistics, a probability vector or stochastic vector is a vector with non-negative entries that add up to one. Now why would it have a relation to discrete probability distribution? --Obsolete.fax (talk) 12:03, 12 February 2008 (UTC)
The positions (indices) of a probability vector represent the possible outcomes of a discrete random variable, and the vector gives us the probability mass function of that random variable, which is the standard way of characterizing a discrete probability distribution. In the example, the possible outcomes form the set of states, where position i of the vector represents State i, and so the value of element i of the vector is the probability of State i occurring.  --Lambiam 08:51, 16 February 2008 (UTC)

This is just a small thing, but could you stop emphasizing things in blue? Please? Black Carrot (talk) 18:53, 11 February 2008 (UTC)

I do apologize if its causing you inconvenience, but you see, otherwise my questions would not stand out, and it would be very difficult for the person with the knowledge to answer it, as that person would have to go through a lot of text to find where is the question.--Obsolete.fax (talk) 03:57, 12 February 2008 (UTC)
Do you really want an answer from someone who hasn't read the discussion? It's not unusual for threads to cover a page or more, and we seem to manage. Black Carrot (talk) 07:47, 12 February 2008 (UTC)
Oh wow, I just saw where my post wound up. That's weird. Black Carrot (talk) 07:48, 12 February 2008 (UTC)
Yes, I was confused at first too. I eventually realised that Obsolete.fax has created 3 sub-pages of Talk:Stochastic matrix and then transcluded them to the article talk page and to the Ref Desk. Gandalf61 (talk) 09:48, 12 February 2008 (UTC)

## Stochastic matrix Question 3

The Stochastic matrix article states in the example:

And since the v which is 1 could also be represented in the formula as:

$\mathbf{p}_k = \mathbf1P^k \, .$

which can be represented by:

$\mathbf{p}_k = P^k .$

Are my above deductions assumptions correct? --Obsolete.fax (talk) 08:53, 30 January 2008 (UTC)

The left hand side of this equation is a row vector and the right hand side is a square matrix, so this equation cannot be correct.
$\mathbf{p}_1 = \mathbf{v}P$
which follows from the definition of P. To move the system forward by one more time step we multiply $\mathbf{p}_1$ by P again, so we get
$\mathbf{p}_2 = \mathbf{p}_1P = \mathbf{v}P^2 \, .$
Continuing like this, after k time steps we have
$\mathbf{p}_k = \mathbf{p}_{k-1}P = \mathbf{v}P^k \, .$
Notice that $\mathbf{v}$ is a row vector - it is not just the number 1. So it is incorrect to say
$\mathbf{p}_k = \mathbf1P^k = P^k\, .$
Gandalf61 (talk) 14:53, 30 January 2008 (UTC)
Obsolete.fax - this is a chunk originally from your talk page, which you moved to Talk:Stochastic matrix. Now you have transcluded three talk page sections to a Reference Desk page. That's a very unusual thing to do, and is likely to confuse both yourself and others (it certainly confused me until I noticed the transclusion). It would have been better to say on the RD page "I have some questions about stochastic matrices" and then give a link to Talk:Stochastic matrix.
My explanation given back then on 30 Jan is essentially the same as Lambian's answer to your previous question above. What part of this explanation do you not understand ? Gandalf61 (talk) 11:29, 11 February 2008 (UTC)
What part of this explanation do you not understand? I do apologize if I am not following you. But where are you deriving the formulas from? And what is the purpose of the formulas which is described? Could you please explain: The formula for the probability distribution after k time steps is derived by mathematical induction. I don't understand the meaning from the articles I wikilinked. --Obsolete.fax (talk) 00:10, 12 February 2008 (UTC)
I am sorry if I am not able to grasp mathematics as easily as you, but I thank you for your consideration. Going a little back in the article: The random variable K gives the number of time steps the mouse stays in the game. Does that mean in this example we are trying to find the probability value of K? Or is it something bigger we are trying to solve in the example? --Obsolete.fax (talk) 12:09, 12 February 2008 (UTC)
K is a random variable that represents the number of time steps before the mouse is caught by the cat, with the starting state always being state 2. Don't confuse upper case K with lower case k, which is a variable representing any number of time steps. The value of K in any single game will depend on random decisions by the cat and the mouse on which way they jump. However, if we play the game many times (always starting in state 2) then we can find an average value of K across all of these games - this is the expected value of K. The article shows how to calculate the expected value of K from the starting state and the stochastic matrix of the game. Gandalf61 (talk) 08:59, 14 February 2008 (UTC)

## Stochastic matrix Question 4

$\mathbf{p}_k = \mathbf{v}P^k \, .$
We want to find the probability that the system is in a given state after a given number of time steps. The set of probabilities for each state after k time steps is given by the probability vector pk. The purpose of the formula is that it gives an expression for the probability vector after k time steps in terms of the initial state vector v and the stochastic matric P - so if we know v and P we can find the probability vector at any subsequent time. The "mathematical induction" part just means that we can derive the general formula for pk by looking at the formulae for p1, p2 etc. and then generalising the pattern that we see to k time steps. Can you see where the formulae that I give above for p1, p2 come from ? Can you see how they lead to a general formula for pk ? Gandalf61 (talk) 09:35, 12 February 2008 (UTC)

I am assuming the formulae $\mathbf{p}_k = \mathbf{v}P^k \,$ that you gave above for p1, p2 came from Summation? If this is true then can the formula be put in Sigma notation format $\sum_{i=m}^n x_i = x_m + x_{m+1} + x_{m+2} +\cdots+ x_{n-1} + x_n.$ ? --Obsolete.fax (talk) 05:28, 17 February 2008 (UTC)

I can't see any connection with summation at all. Gandalf61 (talk) 15:02, 21 February 2008 (UTC)
Then could you answer Can you see where the formulae that I give above for p1, p2 come from ? Can you see how they lead to a general formula for pk ? I don't know. Please help, would really appreciate. --Obsolete.fax (talk) 18:23, 23 February 2008 (UTC)
Then go back and carefully read the explanations that Lambian gave here and that I gave here. They show how the general formula for pk is derived. If there is a step that you don't understand, then say which step you get stuck on. It is not possible to help you any further unless you say which part of the explanation you don't understand. Gandalf61 (talk) 11:09, 24 February 2008 (UTC)

## Odd redirect?

Transition Matrix redirects here, whilst the page refers exclusively, barring the first paragraph, to stochastic matrices. Transition matrices in linear algebra, (at least can) refer to the a matrix which transforms any matrix from one basis-representation into another. I think this article should either include reference to this fact, or the redirect should be removed. Fish-Face (talk) 00:37, 22 January 2009 (UTC)

## Stationary probability vector

I'm quoting the article:

A stationary probability vector $\boldsymbol{\pi}$ is defined as a vector that does not change under application of the transition matrix; that is, it is defined as a left eigenvector of the probability matrix, associated with eigenvalue 1:

$\boldsymbol{\pi}P=\boldsymbol{\pi}$.

The Perron-Frobenius theorem ensures that such a vector exists, and that the largest eigenvalue associated with a stochastic matrix is always 1. For a matrix with strictly positive entries, this vector is unique. In general, however, there may be several such vectors.

This is slightly misstated. Trivially, a vector that does not change under application of the transition matrix exists (the zero vector). It is also very easy to see that a nonzero vector with this property exists. What actually uses the Perron-Frobenius theorem is the stronger assertion that a nonzero vector with this property and with all coordinates nonnegative exists. And probably even all coordinates positive, if the matrix is irreducible. - Darij (talk) 22:59, 18 March 2010 (UTC)

The whole section on the Perron Forbenius theorem is misstated. The conventional definition of steady state is that which is given by the limit, and the limit does not always exist. For example the matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0\end{bmatrix}$ clearly does not have a steady state limit as it simply toggles states. By the definition given in the article $\begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2\end{bmatrix}$ would be the steady state, which is most likely not the most commonly used definition. Antares5245 (talk) 11:25, 6 April 2010 (UTC)
After more than two years, this section of the article is still extremely misleading. After reading it took me quite some time to look up and understand what does the theorem say and what does it mean for the different kinds of stochastic matrices (irreducible, aperiodic, positive...). Yet still I do not feel that I could correct the wording, if anybody is knowledgeable enough, please do so - it simply should not stay like this. Tomash (talk) 09:50, 20 December 2012 (UTC)

## Missing States?

There appears to be some missing states. The state diagram should look something like this:

 S1: (1,5) ↓↑ ↔ S2: (2,4) ↔ ↑↓ ↓ ↑↓ S3: (1,3) S4: (3,3) S5: (3,5) ↓ ↓ S6: (2,2) S7: (4,4)

for which the right transition matrix is:

$\begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1/4 & 0 & 1/4 & 1/4 & 1/4 & 0 & 0 \\ 0 & 1/2 & 0 & 0 & 0 & 1/2 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 0 & 0 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$

For initial state (1,0,0,0,0,0,0), the repeated action of the transition matrix leads to a stationary state, (0,0,0,1/2,0,1/4,1/4). --Jbergquist (talk) 05:09, 11 January 2014 (UTC)

The three terminal states were merged into one which simplifies the problem. For computing survival time where the game end up at is unimportant.
 S1: (1,5) ↓↑ ↔ S2: (2,4) ↔ ↑↓ ↓ ↑↓ S3: (1,3) ↓ S4: (3,5) ↓ ↓ ↓ → S5: end ←
The states in the article are ordered by the first number and then the seconed number in the ordered pair (c,m) specifying cat position and mouse position so we get the sequence (1,3), (1,5), (2,4), (3,5) and "game over". Numbering the states by order of appearance as indicated above may make the article easier to follow and a state diagram would probable help too. --Jbergquist (talk) 09:24, 11 January 2014 (UTC)

## Definition

Shouldn't the notion for the probability of moving from i to j in one time step be $Pr(j|i)$ -- instead of $Pr(i|j)$ that is currently in the article.