This article was prompted by a question asked by one of my students about 5 minutes before the end of the final lecture of my Markov Chains course in Linyi. Although I didn’t have time to think about it right then, the flight home offered plenty of opportunity for having a play around. I am well aware that the proof given below is not the best one, but the ideas of minimising quadrants of a matrix seemed fairly natural. Anyway, it’s been sitting on my desktop for over two months now, so I decided I should put it up.

———

Recall that the transition probabilities of a finite Markov chain are determined by a *stochastic matrix* P. That is, each row of P is a probability distribution. In both theory and applications, the existence and properties of an *invariant distribution* is of interest. This is a probability distribution satisfying the relation:

(1)

It is clear that is a right- or column eigenvector of P, with eigenvalue 1. Since the spectrum of is the same as that of P, we conclude that 1 is a left-eigenvalue of P also. So we can be assured of the existence of a vector satisfying (1). What is unclear is that this eigenvector should be a probability distribution. Since at least one entry must be non-zero, it will suffice to show that every entry of is non-negative.

A necessary condition for the uniqueness of an invariant distribution is that the Markov chain be *irreducible*. This is best defined using the terminology of random walks: the chain is irreducible if for every pair of states , it is possible to move from *i* to *j* and back again. In terms of the transition matrix, P is irreducible if it is not block upper-triangular, up to reordering rows and columns.

We want to show that when P is irreducible, the (unique) 1-eigenvector is a probability distribution. The standard method proposed in this context is to exhibit the invariant distribution directly. For example, Norris’s *Markov Chains* defines

and shows that satisfies (1).

Nonetheless, the result required is clearly at least one step removed from the probabilistic interpretation, so it would be satisfying to find a direct proof of existence. Typically, one quotes the substantially more general theorem of Perron and Frobenius, the most relevant form of which is:

**Theorem (Perron-Frobenius):** Given A a non-negative and irreducible square matrix. Then there is a positive real eigenvalue with multiplicity 1 and such that all other eigenvalues have absolute value less than or equal to . Then the (unique up to scaling) left- and right-eigenvectors corresponding to are positive.

Here we present a short proof of this result in the case where A is the (stochastic) transition matrix of a Markov chain.

**Proposition:** An irreducible stochastic matrix has a 1-eigenvector with all entries non-negative.

*Proof:* We show instead the contrapositive: that if a stochastic matrix has a 1-eigenvector with both negative and non-negative components, then it is reducible. The motivation is this third eigenvector given in example (2). Observe that the communicating classes are and , and it is not hard to see that for any eigenvector with negative and non-negative components, the sign of a component is a class property.

Informally, given an stochastic matrix P, and a 1-eigenvector with this property, we relabel the states so that the non-negative components, which we call are first. That is, in a moderate abuse of notation:

our aim is to show that the sub-matrices and are both zero. This implies that states in *A* and states in *B* do not communicate, showing that *P* is reducible. We can formulate this as a linear programming problem:

It suffices to show that this maximum is 0. Now, we take , and assume that , that is, there are a positive number of negative and non-negative components. Noting that the sum of the rows in a stochastic matrix is 1, we may consider instead:

and it will suffice to show that this minimum is *n*. To do this, we consider instead the dual:

The objective is certainly bounded by *n*. And in fact this is attainable, for example by taking:

Applying strong duality for linear programming problems completes the proof.

###### Related articles

- Understanding Linear Algebra Intuitively (stata.com)
- Simulation II: Markov Chains, Monte Carlo, and Markov Chain Monte Carlo (Introduction to Statistical Computing) (bactra.org)
- Markovian Excursions (eventuallyalmosteverywhere.wordpress.com)

Pingback: Convergence of Transition Probabilities | Eventually Almost Everywhere

Pingback: Branching Random Walk and Amenability | Eventually Almost Everywhere

Pingback: Perturbation of Eigenvectors | Eventually Almost Everywhere

Pingback: Lecture 9 – Inhomogeneous random graphs | Eventually Almost Everywhere