When is a Markov chain a Markov chain?

I’ve been taking tutorials on the third quarter of the second-year probability course, in which the student have met discrete-time Markov chains for the first time. The hardest aspect of this introduction (apart from the rapid pace – they cover only slightly less material than I did in Cambridge, but in half the time) is, in my opinion, choosing which definition of the Markov property is most appropriate to use in a given setting.

We have the wordy “conditional on the present, the future is independent of the past”, which is probably too vague for any precise application. Then you can ask more formally that the transition probabilities are the same under two types of conditioning, that is conditioning on the whole history, and conditioning on just the current value

\mathbb{P}(X_{n+1}=i_{n+1} \,\big|\, X_n=i_n,\ldots,X_0=i_0) = \mathbb{P}(X_{n+1}=i_{n+1} \,\big |\, X_n=i_n), (*)

and furthermore this must hold for all sets of values (i_{n+1},\ldots,i_0) and if we want time-homogeneity (as is usually assumed at least implicitly when we use the word ‘chain’), then these expressions should be functions of (i_n,i_{n+1}) but not n.

Alternatively, one can define everything in terms of the probability of seeing a given path:

\mathbb{P}(X_0=i_0,\ldots,X_n=i_n)= \lambda_{i_0}p_{i_0,i_1}\ldots p_{i_{n-1}i_n},

where \lambda is the initial distribution, and the p_{i,j}s are the entries of the transition matrix P.

Fortunately, these latter two definitions are equivalent, but it can be hard to know how to proceed when you’re asked to show that a given process is a Markov chain. I think this is partly because this is one of the rare examples of a concept that students meet, then immediately find it hard to think of any examples of similar processes which are not Markov chains. The only similar concept I can think of are vector spaces, which share this property mainly because almost everything in first-year mathematics is linear in some regard.

Non-examples of Markov chains

Anyway, during the tutorials I was asking for some suggestions of discrete-time processes on a countable or finite state space which are not Markov chains. Here are some things we came up with:

  • Consider a bag with a finite collection of marbles of various colours. Record the colours of marbles sampled repeatedly without replacement. Then the colour of the next marble depends on the set you’ve already seen, not on the current colour. And of course, the process terminates.
  • Non-backtracking random walk. Suppose you are on a graph where every vertex has degree at least 2, and in a step you move to an adjacent vertex, chosen uniformly among the neighbours, apart from the one from which you arrived.
  • In a more applied setting, it’s reasonable to assume that if we wanted to know the chance it will rain tomorrow, this will be informed by the weather over the past week (say) rather than just today.

Showing a process is a Markov chain

We often find Markov chains embedded in other processes, for example a sequence of IID random variables X_1,X_2,\ldots. Let’s consider the random walk S_n=\sum_{i=1}^n X_i, where each X_i =\pm 1 with probability p and (1-p). Define the running maximum M_n=\max_{m\le n}S_m, and then we are interested in Y_n:=M_n-S_n, which we claim is a Markov chain, and we will use this as an example for our recipe to show this in general.

We want to show (*) for the process Y_n. We start with the LHS of (*)

\mathbb{P}(Y_{n+1}=i_{n+1} \,\big|\, Y_n=i_n,\ldots,Y_0=i_0),

and then we rewrite Y_{n+1} as much as possible in terms of previous and current values of Y, and quantities which might be independent of previous values of Y. At this point it’s helpful to split into the cases i_n=0 and i_n\ne 0. We’ll treat the latter for now. Then


so we rewrite as

=\mathbb{P}(X_{n+1}=i_{n+1}-i_n \, \big |\, Y_n=i_n,\ldots, Y_0=i_0),

noting that we substitute i_n for Y_n since that’s in the conditioning. But this is now ideal, since X_{n+1} is actually independent of everything in the conditioning. So we could get rid of all the conditioning. But we don’t really want to do that, because we want to have conditioning on Y_n left. So let’s get rid of everything except that:

=\mathbb{P}(X_{n+1}=i_{n+1}-i_n\, \big |\, Y_n=i_n).

Now we can exactly reverse all of the other steps to get back to

= \mathbb{P}(Y_{n+1}=i_{n+1} \,\big|\, Y_n=i_n),

which is exactly what we required.

The key idea is that we stuck to the definition in terms of Y, and held all the conditioning in terms of Y, since that what actually determines the Markov property for Y, rearranging the event until it’s in terms of one of the underlying Xs, at which point it’s easy to use independence.

Showing a process is not a Markov chain

Let’s show that M_n is not a Markov chain. The classic mistake to make here is to talk about possible paths the random walk S could take, which is obviously relevant, but won’t give us a clear reason why M is not Markov. What we should instead do is suggest two paths taken by M, which have the same ‘current’ value, but induce transition probabilities, because they place different restrictions on the possible paths taken by S.


In both diagrams, the red line indicates a possible path taken by (M_0,M_1,\ldots,M_4), and the blue lines show possible paths of S which could induce these.

In the left diagram, clearly there’s only one such path that S could take, and so we know immediately what happens next. Either X_5=+1 (with probability p) in which case M_5=S_5=3, otherwise it’s -1, in which case M_5=2.

In the right diagram, there are two possibilities. In the case that S_4=0, clearly there’s no chance of the maximum increasing. So in the absence of other information, for M_5=3, we must have X_4=X_5=+1, and so the chance of this is p^2.

So although the same transitions are possible, they have different probabilities with different information about the history, and so the Markov property does not hold here.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s