lower bounds | Eventually Almost Everywhere

At our training and selection camp in Tonbridge in May, I gave a session about the use of probabilistic methods to solve olympiad combinatorics questions. Such an approach will normally be relevant to a problem where it is required to show the existence of a structure satisfying a particular property. We might consider constructing a structure randomly, and then try to show that the probability that our construction has the required property is non-zero.

There are several possible approaches to this, but often for the property we seek, we can describe a family of ‘bad events’ $A_1,\ldots,A_n$, and then we have the property if none of the bad events holds. That is, we are looking for general conditions under which we can show that $\mathbb{P}(\bigcap_{i=1}^n A_i^c)>0$.

We have two cases where this is easy to do.

1) If all the $A_i$ s are independent, then so long as all $\mathbb{P}(A_i)<1$ , we have our result.

2) If the probabilities of the bad events sum to less than 1, then we can use a union bound

$\mathbb{P}(\bigcup_{i=1}^n A_i)\le \sum_{i=1}^n \mathbb{P}(A_i) <1,$

to conclude what we want.

In Tonbridge we also discussed first-moment methods, where we show that the expected number of bad events is less than 1, meaning there must be some element of the probability space where the number of bad events is zero. In this article, we’re looking in a different direction. We’ll try to interpolate between situations 1) and 2), to make some concrete comment on the situation where the probabilities of the bad events are small, but not small enough to use the union bound, and where the events are not quite independent, but are in some sense almost independent.

The first thing we need to do is think about what it means for a family of events to be independent. Suppose we toss a fair coin twice, and consider the events:

A := {first coin is H}, B := {second coin is T}, C := {exactly one H}.

So, it is easy to see that each pair of events is independent, but if we know that A and B hold, then also C holds (and indeed this stays true under cyclic re-ordering). So C is not independent of the family of events {A,B}. Rather than give a formal definition, I’ll say instead that an event B is said to be independent of the family of events $\{A_1,\ldots,A_5\}$ if it is independent of

– $A_3$

– $A_1\cap A_2$

– $A_3\cap A_4^c\cap A_5^c$ ,

and so on. I hope it’s clear from this what I mean. Slogan: no knowledge about which of the $A_i$ do or don’t hold gives information about B.
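The two-coin example above can be checked by brute force over the four outcomes. This is only a minimal sketch: the helper names `prob`, `both` and `independent` are made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin tosses, each of the 4 outcomes equally likely.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def both(x, y):
    """The intersection of two events."""
    return lambda w: x(w) and y(w)

def independent(x, y):
    """Check P(X and Y) == P(X) * P(Y)."""
    return prob(both(x, y)) == prob(x) * prob(y)

A = lambda w: w[0] == "H"          # first coin is H
B = lambda w: w[1] == "T"          # second coin is T
C = lambda w: w.count("H") == 1    # exactly one H

if __name__ == "__main__":
    # Each pair of events is independent...
    print(independent(A, B), independent(A, C), independent(B, C))
    # ...but C is not independent of the intersection of A and B,
    # so C is not independent of the family {A, B}.
    print(independent(C, both(A, B)))
```

Checking independence from a family in general means testing every intersection of the $A_i$ and their complements, which is what the list above is gesturing at.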

Now we return to our original setup. We want each $A_i$ to be independent of most of the rest, and so we choose for each $i\in[n]$ a dependency set $D_i\subset [n]$ of indices, such that $A_i$ is independent of the family of events $\{A_j: j\in [n]\backslash D_i\}$. It is tempting to interpret this immediately as saying that $A_i$ depends on each event with index in $D_i$. This will normally be fine in practice, but we should emphasise that there is a lot of freedom to choose $D_i$, and the genuinely important condition is that $A_i$ is independent of the family given by the rest of the indices.

[*Health warning*: The language I am using seems sensible for anyone learning about this for the first time, but is very slightly non-standard. Instead of dependency sets, the classical notation is to consider a dependency digraph on $[n]$, with an edge $i\to j$ whenever $j\in D_i$.]

The symmetric version of the Lovász Local Lemma says: suppose $\mathbb{P}(A_i)\le p$ for each i, and we can choose $D_i$ as described so that $|D_i|\le d$ for each i. Then, if $epd\le 1$, we may conclude $\mathbb{P}(\bigcap_{i=1}^n A_i^c)>0$.

We’ll come back to the proof, which is best seen in a slightly more general formulation. First, let’s get to grips with the notation, by reassuring ourselves that this really does lie between the union bound and the independence case.

If the events are independent, then clearly we may take $D_i=\{i\}$ for each i, that is $d=1$, so we may draw the desired conclusion so long as $p\le 1/e$, which is fine, but a lot less convincing than $p<1$. Similarly, for the union bound, we have said nothing about the dependency relationships, and so we have to take $D_i=[n]$ for each i, that is $d=n$. So we may draw the desired conclusion provided $p\le 1/(ne)$, which is again a factor of e worse than what we would have had with the union bound itself.

Now we’ll see how this might be useful when applied, for example, to a probabilistic construction for the lower bound on the Ramsey number $R(k)$. From Erdős’s argument, we know $R(k)\ge (1+o(1)) 2^{k/2}$, and we will earn an extra factor of $k\sqrt{2}/e$. An extra factor of $k/e$ can also be obtained by an alteration argument, but this extra factor of $\sqrt{2}$ makes this LLL method one of the strongest available for this problem.

Recall that for a lower bound, we are trying to find examples of 2-edge-colourings of a large complete graph $K_n$, such that there is no monochromatic copy of a $K_k$. We consider the uniform independent random edge-colouring. It makes sense that a bad event $A_S$ should be the presence of a monochromatic complete graph induced on a set $S\subset [n]$ of size k. Since there are two potential colours for the monochromatic induced $K_k$, we have $\mathbb{P}(A_S)=2^{1-\binom{k}{2}}$. Then we take the dependency set $D_S$ of $A_S$ to include all those k-sets T which share an edge with S, that is, those with $|S\cap T|\ge 2$. This works because $A_S$ is determined by the colours of the edges inside S, and these are independent of the colours of all edges inside the k-sets that share no edge with S. We think about which vertices might contribute to the shared edge, and which make up the remainder, to conclude $|D_S|\le \binom{k}{2}\binom{n-2}{k-2}$.

So now, whenever $e\cdot 2^{1-\binom{k}{2}}\binom{k}{2}\binom{n-2}{k-2}\le 1$, as a consequence of LLL we can conclude that with positive probability the random colouring gives a suitable example, that is $R(k)>n$. After fiddling around with Stirling’s formula in a fashion that isn’t hugely interesting, we can conclude $R(k)\ge (1+o(1)) \frac{k\sqrt{2}}{e} 2^{k/2}$.
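For small k, the condition $epd\le 1$ can simply be checked numerically rather than via Stirling. The following is a rough sketch, not part of the argument: the function names are invented for illustration, it assumes Python 3.8+ for `math.comb`, and it works with base-2 logarithms so that the large binomial coefficients never need converting to floats.

```python
import math

def lll_condition_holds(n: int, k: int) -> bool:
    """Check e * p * d <= 1 with p = 2^(1 - C(k,2)), d = C(k,2) * C(n-2, k-2)."""
    if n < k:
        return True  # no k-sets at all, so there are no bad events
    d = math.comb(k, 2) * math.comb(n - 2, k - 2)
    # log2(e * p * d) = log2(e) + (1 - C(k,2)) + log2(d)
    log2_epd = math.log2(math.e) + (1 - math.comb(k, 2)) + math.log2(d)
    return log2_epd <= 0

def largest_n(k: int) -> int:
    """Largest n for which the symmetric LLL condition holds, giving R(k) > n."""
    n = k - 1  # trivially fine: K_{k-1} contains no K_k
    while lll_condition_holds(n + 1, k):  # d is increasing in n
        n += 1
    return n

def asymptotic_bound(k: int) -> float:
    """The leading term (sqrt(2)/e) * k * 2^(k/2), without the (1+o(1))."""
    return math.sqrt(2) / math.e * k * 2 ** (k / 2)

if __name__ == "__main__":
    for k in (6, 10, 14, 18):
        n = largest_n(k)
        print(k, n, round(n / asymptotic_bound(k), 3))
```

The last column is the ratio of the exact threshold to the leading term, which is the $1+o(1)$ factor in disguise. The linear search is deliberately naive, and is fast enough for k up to about 20.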

The prompt for this article was a discussion during our Malaysian training camp of methods applicable to IMO 2014 Q6. If you want to know just how applicable LLL is, I suggest you have a think yourself. It’s not a direct application – so some thought is involved. Maybe as an appetiser, here are some more elementary exercises, which I’ve borrowed from examples on Po-Shen Loh’s olympiad handouts, and Wikipedia, though I doubt the latter is the original source:

1) 11n points on a circle are coloured with n colours, such that each colour is used exactly 11 times. Show that it’s possible to choose one point of each colour such that no pair are adjacent.

2) A finite graph is given, and all degrees are between 50 and 100. Find some finite C such that you can always colour the vertices of such a graph so that the neighbourhood of any vertex includes at least 20 colours.

Finally, we discuss the more general statement of LLL, and explain how the proof works.

General Lovász Local Lemma: Suppose there exist $x_i\in [0,1)$ such that $\mathbb{P}(A_i)\le x_i \prod_{j\in D_i\backslash\{i\}} (1-x_j)$ for each i (*). Then $\mathbb{P}(\bigcap_{i=1}^n A_i^c)\ge \prod_{i=1}^n (1-x_i)>0$.

Deducing the symmetric form from the general form is not hard. Condition (*) is motivated by the proof. We want to be able to say that no matter which other events and their unions, complements etc we condition on, we still have an upper bound for the probability of $A_i$ . This bound will be $x_i$ . In particular, we want to show that the probability of bad event $A_i$ does not get too high, even if we know that lots of other bad events have not occurred.
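To sketch the deduction of the symmetric form: one natural choice, which is an assumption on my part rather than anything forced by the statement, is $x_i=1/d$ for every i. This requires $d\ge 2$ so that $x_i<1$, and we assume $i\in D_i$ as in the examples above, so that $|D_i\backslash\{i\}|\le d-1$. Then

$x_i \prod_{j\in D_i\backslash\{i\}} (1-x_j) \ge \frac{1}{d}\left(1-\frac{1}{d}\right)^{d-1} \ge \frac{1}{ed} \ge p \ge \mathbb{P}(A_i),$

where the second inequality uses $\left(1+\frac{1}{d-1}\right)^{d-1}\le e$, and the third is exactly $epd\le 1$. So (*) holds. The remaining case $d=1$ is the independent case, where $p\le 1/e<1$ gives the conclusion directly.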

The proof proceeds by showing $\mathbb{P}(A_i | \bigcap_{j\in S}A_j^c)\le x_i$ for all i and all $S\subset [n]\backslash\{i\}$, by induction on $|S|$. For the inductive step, you split $S=S_1\cup S_2$ where $S_1=S\cap D_i$, $S_2=S\cap D_i^c$. If $S_1=\varnothing$, you are done straight away, by the assumption (*) and independence of $A_i$ and the events not indexed by $D_i$. Otherwise, you can use the inductive hypothesis on $S_2$, and repeated Bayes’ theorem to show what you want in a series of steps that have a lot of notation, but aren’t hugely difficult.
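For reference, here is one way those steps can be written out. It is a sketch of the standard argument, with the labelling $S_1=\{j_1,\ldots,j_r\}$ introduced just for this purpose, and with $i\notin S$ throughout. When $S_1\ne\varnothing$, write

$\mathbb{P}\left(A_i \,\Big|\, \bigcap_{j\in S}A_j^c\right) = \frac{\mathbb{P}\left(A_i\cap \bigcap_{j\in S_1}A_j^c \,\Big|\, \bigcap_{l\in S_2}A_l^c\right)}{\mathbb{P}\left(\bigcap_{j\in S_1}A_j^c \,\Big|\, \bigcap_{l\in S_2}A_l^c\right)}.$

The numerator is at most $\mathbb{P}(A_i | \bigcap_{l\in S_2}A_l^c)=\mathbb{P}(A_i)\le x_i\prod_{j\in D_i\backslash\{i\}}(1-x_j)$, by independence and then (*). For the denominator, peel off the events of $S_1$ one at a time:

$\mathbb{P}\left(\bigcap_{j\in S_1}A_j^c \,\Big|\, \bigcap_{l\in S_2}A_l^c\right) = \prod_{m=1}^r \mathbb{P}\left(A_{j_m}^c \,\Big|\, \bigcap_{t<m}A_{j_t}^c\cap \bigcap_{l\in S_2}A_l^c\right) \ge \prod_{m=1}^r (1-x_{j_m}) \ge \prod_{j\in D_i\backslash\{i\}}(1-x_j).$

Each factor is conditioned on fewer than $|S|$ events, so the inductive hypothesis applies, and the final inequality holds because $S_1\subset D_i\backslash\{i\}$ and every factor $1-x_j$ is at most 1. Dividing gives the bound $x_i$. The same induction shows that all the events we condition on have positive probability. The lemma itself then follows from one more application of the same peeling:

$\mathbb{P}\left(\bigcap_{i=1}^n A_i^c\right) = \prod_{i=1}^n \mathbb{P}\left(A_i^c \,\Big|\, \bigcap_{j<i}A_j^c\right) \ge \prod_{i=1}^n (1-x_i).$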