Independence and Association

Back when we did GCSE probability, we gave a definition of independent events as:

A and B are said to be independent if \mathbb{P}(A)\mathbb{P}(B)=\mathbb{P}(A\cap B).

We might also apply Bayes’ definition of conditional probability to say

\mathbb{P}(A|B)=\mathbb{P}(A)\quad\iff\quad A,B\text{ independent}\quad\iff\quad\mathbb{P}(B|A)=\mathbb{P}(B)

provided all the terms exist. (Eg the definition of \mathbb{P}(B|A) is at the very least non-obvious if the probability of A is 0.) In my opinion, this is a more naturally intuitive definition. For example, I think that when you toss two coins, the fact that the probability of the second coin being a tail is unaffected by whether the first is heads is more naturally ‘obvious’ than the fact that the joint probability of the two events is 1/4.

But, before getting too into anything philosophical, it is worth thinking about an equivalent situation for non-independent events. We remark that by an identical argument to above:

\mathbb{P}(A|B)\geq\mathbb{P}(A)\quad\iff\quad \mathbb{P}(A\cap B)\geq\mathbb{P}(A)\mathbb{P}(B)\quad\iff\quad\mathbb{P}(B|A)\geq\mathbb{P}(B)

Informally, this says that if we know A occurs, it increases the likelihood of B occuring. If we were talking about two random variables, we might say that they were positively correlated. But of course, by considering RVs 1_A,1_B, the result above is precisely the statement that the indicator functions have positive correlation.

Aim: To find a sufficient condition for positive correlation of random variables in a product measure.

Consider the following. Suppose A is an event which is positively correlated with the appearance of each edge. We might suspect that two such events A and B would be positively correlated. Instead, we consider a more concrete description. Recall that an event A is a subset of \Omega=\{0,1\}^E. Given w\in\Omega,e\in E, we say w^e\in\Omega defined by taking w and setting edge e to be open (note it may be open already). Now, we say event A is increasing, if

\forall w\in\Omega,\forall e\in E: w\in A\Rightarrow w^e\in A.

Note that this certainly implies the property previously mentioned, but the converse is not necessarily true.

Anyway, our revised aim will be to show that increasing events A and B are positively correlated for product measure.

For now, we approach the problem from the other direction, namely we attempt to find which measures on \{0,1\}^E have the property that A and B are positively correlated for all increasing A, B. Note that as before, we can think of this as \mathbb{E}1_A1_B\geq\mathbb{E}1_A\mathbb{E}1_B, and again here it is useful to rephrase our framework in terms of random variables. There is a natural (product) partial ordering of \Omega=\{0,1\}^E, and from this there is an easy notion of increasing random variables. Recall a random variable is defined as a measurable map \Omega\rightarrow\mathbb{R} so no further work is required.

X is increasing if w\geq w'\Rightarrow X(w)\geq X(w').

So we clarify our aim, which is to find a condition on the measure \mu such that \mu(XY)\geq \mu(X)\mu(Y) for all increasing X, Y. When this occurs, we say \mu is positively associated. Note that this is equivalent to \mu(A\cap B)\geq \mu(A)\mu(B) for all increasing events A, B. Why? We can build up X and Y from increasing indicator functions like \{X\geq x\} in a usual monotone class argument.

On the way, we need a partial ordering on the set of probability measures. Obviously, if \mu(A)\leq \nu(A) for all events A, then in fact \mu=\nu! So instead we say \mu\leq_{st}\nu if \mu(A)\leq \nu(A) for all increasing A. This is called the stochastic ordering, and there is a technical result of Strassen, proving the intuitively obvious claim that if \mu_1\leq \mu_2, then we can couple the measures in a natural way. Formally:

Theorem: \mu_1\leq\mu_2 \iff \exists a probability measure \nu on \Omega^2 such that the marginals are \mu_1,\mu_2 and

\nu(\{(w_1,w_2):w_1\leq w_2\})=1.

Our main result will be the FKG inequality which asserts that when \mu satisfies the following FKG lattice property

\mu(w_1\vee w_2)\mu(w_1\wedge w_2)\geq \mu(w_1)\mu(w_2),\quad\forall w_1,w_2\in\Omega

then \mu is positively associated. We will prove the case |E|<\infty.

We proceed by showing that \mu_1\leq\mu_2\propto Y\mu_1, rescaled, for Y an increasing RV. [Note that we are now suppressing the ‘st’ subscript, as context makes the use clear.]

To show this, we prove the more general Holley’s Theorem:

This states that if two positive probability measures satisfy a related lattice condition:

\mu_2(w_1\vee w_2)\mu_1(w_1\wedge w_2)\geq \mu_1(w_1)\mu_2(w_2)\quad\forall w_1,w_2\in\Omega

then we have the stochastic domination result: \mu_1\leq \mu_2.

Note that the lattice condition states, very informally, that adding edges results in a greater relative increase with respect to the measure \mu_2, which has a natural similarity to the definition of stochastic domination.

We prove this, perhaps unexpectedly, by resorting to a Markov chain. We note that there is a Markov chain on \Omega with equilibrium distribution given by \mu_1. This is simple: the non-zero transition rates are those given by the addition or removal of a single edge. Assume that edges are added at unit rate, and that edges are removed with rate: G(w^e,w_e)=\frac{\mu_1(w_e)}{\mu_1(w^e)}.

Similarly, we can construct a Markov chain on state space \Omega^2, where non-zero transitions are given by the addition of an edge to both states in the pair, the removal of an edge from both states in the pair, and the removal of an edge from only the first edge in the pair. Note that, as before, we may be ‘adding’ an edge which is already present. Assuming we start in this set, this choice means that we are restricting the sample space to \{(\pi,w):\pi\leq w\}. We need the transition rate of the third type of transition to have the form: \frac{\mu_1(\pi_e)}{\mu_1(\pi^e)}-\frac{\mu_2(w_e)}{\mu_2(w^e)}. So the lattice condition precisely confirms that this is non-negative, and thus we have a well-constructed Markov chain. The marginals have equilibrium distributions \mu_1,\mu_2 by construction, and by the general theory of Markov chains, there is an equilibrium distribution, and this leaves us in precisely the right position to apply Strassen to conclude the result.#

Summary of consequences: We have demonstrated that product measure is positively associated, as it certainly satisfies the FKG condition. Recall that this is what we had suspected intuitively for reasons given at the start of this account. Next time, I will talk about the most natural companion result, the BK inequality, and the stronger Reimer’s Inequality.

References: Both the motivation and the material is derived from Prof. Grimmett’s Part III course, Percolation and Related Topics, which was one of the mathematical highlights of the year. This account of the subject is a paraphrase of his lecture notes, which were themselves based on his book Probability on Graphs. Mistakes, naturally, are mine. Background on the course, and an online source of the book can be found on the course website here.

Advertisements

One thought on “Independence and Association

  1. Pingback: 100k Views | Eventually Almost Everywhere

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s