This article was prompted by a question asked by one of my students about 5 minutes before the end of the final lecture of my Markov Chains course in Linyi. Although I didn’t have time to think about it right then, the flight home offered plenty of opportunity for having a play around. I am well aware that the proof given below is not the best one, but the ideas of minimising quadrants of a matrix seemed fairly natural. Anyway, it’s been sitting on my desktop for over two months now, so I decided I should put it up.
———
Recall that the transition probabilities of a finite Markov chain are determined by a stochastic matrix P. That is, each row of P is a probability distribution. In both theory and applications, the existence and properties of an invariant distribution is of interest. This is a probability distribution satisfying the relation:
(1)
It is clear that is a right- or column eigenvector of P, with eigenvalue 1. Since the spectrum of
is the same as that of P, we conclude that 1 is a left-eigenvalue of P also. So we can be assured of the existence of a vector
satisfying (1). What is unclear is that this eigenvector
should be a probability distribution. Since at least one entry must be non-zero, it will suffice to show that every entry of
is non-negative.
A necessary condition for the uniqueness of an invariant distribution is that the Markov chain be irreducible. This is best defined using the terminology of random walks: the chain is irreducible if for every pair of states , it is possible to move from i to j and back again. In terms of the transition matrix, P is irreducible if it is not block upper-triangular, up to reordering rows and columns.
We want to show that when P is irreducible, the (unique) 1-eigenvector is a probability distribution. The standard method proposed in this context is to exhibit the invariant distribution directly. For example, Norris’s Markov Chains defines
and shows that satisfies (1).
Nonetheless, the result required is clearly at least one step removed from the probabilistic interpretation, so it would be satisfying to find a direct proof of existence. Typically, one quotes the substantially more general theorem of Perron and Frobenius, the most relevant form of which is:
Theorem (Perron-Frobenius): Given A a non-negative and irreducible square matrix. Then there is a positive real eigenvalue with multiplicity 1 and such that all other eigenvalues have absolute value less than or equal to
. Then the (unique up to scaling) left- and right-eigenvectors corresponding to
are positive.
Here we present a short proof of this result in the case where A is the (stochastic) transition matrix of a Markov chain.
Proposition: An irreducible stochastic matrix has a 1-eigenvector with all entries non-negative.
Proof: We show instead the contrapositive: that if a stochastic matrix has a 1-eigenvector with both negative and non-negative components, then it is reducible. The motivation is this third eigenvector given in example (2). Observe that the communicating classes are and
, and it is not hard to see that for any eigenvector with negative and non-negative components, the sign of a component is a class property.
Informally, given an stochastic matrix P, and a 1-eigenvector
with this property, we relabel the states so that the non-negative components, which we call
are first. That is, in a moderate abuse of notation:
our aim is to show that the sub-matrices and
are both zero. This implies that states in A and states in B do not communicate, showing that P is reducible. We can formulate this as a linear programming problem:
It suffices to show that this maximum is 0. Now, we take , and assume that
, that is, there are a positive number of negative and non-negative components. Noting that the sum of the rows in a stochastic matrix is 1, we may consider instead:
and it will suffice to show that this minimum is n. To do this, we consider instead the dual:
The objective is certainly bounded by n. And in fact this is attainable, for example by taking:
Applying strong duality for linear programming problems completes the proof.
Related articles
- Understanding Linear Algebra Intuitively (stata.com)
- Simulation II: Markov Chains, Monte Carlo, and Markov Chain Monte Carlo (Introduction to Statistical Computing) (bactra.org)
- Markovian Excursions (eventuallyalmosteverywhere.wordpress.com)