Lecture 7 – The giant component

I am aiming to write a short post about each lecture in my ongoing course on Random Graphs. Details and logistics for the course can be found here.

As we edge into the second half of the course, we are now in a position to return to the question of the phase transition between the subcritical regime \lambda<1 and the supercritical regime \lambda>1 concerning the size of the largest component L_1(G(n,\lambda/n)).

In Lecture 3, we used the exploration process to give upper bounds on the size of this largest component in the subcritical regime. In particular, we showed that

\frac{1}{n}\big| L_1(G(n,\lambda/n)) \big| \stackrel{\mathbb{P}}\rightarrow 0.

If we used slightly stronger random walk concentration estimates (Chernoff bounds rather than 2nd-moment bounds from Chebyshev’s inequality), we could in fact have shown that with high probability the size of this largest component was at most some logarithmic function of n.

In this lecture, we turn to the supercritical regime. In the previous lecture, we defined various forms of weak local limit, and asserted (without attempting the notationally-involved combinatorial calculation) that the random graph G(n,\lambda/n) converges locally weakly in probability to the Galton-Watson tree with \text{Poisson}(\lambda) offspring distribution, as we’ve used informally earlier in the course.

Of course, when \lambda>1, this branching process has strictly positive survival probability \zeta_\lambda>0. At a heuristic level, we imagine that all vertices whose local neighbourhood is ‘infinite’ are in fact part of the same giant component, which should occupy (\zeta_\lambda+o_{\mathbb{P}}(1))n vertices. In its most basic form, the result is

\frac{1}{n}\big|L_1(G(n,\lambda/n))\big|\;\stackrel{\mathbb{P}}\longrightarrow\; \zeta_\lambda,\quad \frac{1}{n}\big|L_2(G(n,\lambda/n))\big| \;\stackrel{\mathbb{P}}\longrightarrow\; 0, (*)

where the second part is a uniqueness result for the giant component.

The usual heuristic for proving this result is that all ‘large’ components must in fact be joined. For example, if there are two giant components, with sizes \approx \alpha n,\approx \beta n, then each time we add a new edge (such an argument is often called ‘sprinkling‘), the probability that these two components are joined is \approx 2ab, and so if we add lots of edges (which happens as we move from edge probability \lambda-\epsilon\mapsto \lambda ) then with high probability these two components get joined.

It is hard to make this argument rigorous, and the normal approach is to show that with high probability there are no components with sizes within a certain intermediate range (say between \Theta(\log n) and n^\alpha) and then show that all larger components are the same by a joint exploration process or a technical sprinkling argument. Cf the books of Bollobas and of Janson, Luczak, Rucinski. See also this blog post (and the next page) for a readable online version of this argument.

I can’t find any version of the following argument, which takes the weak local convergence as an assumption, in the literature, but seems appropriate to this course. It is worth noting that, as we shall see, the method is not hugely robust to adjustments in case one is, for example, seeking stronger estimates on the giant component (eg a CLT).

Anyway, we proceed in three steps:

Step 1: First we show, using the local limit, that for any \epsilon>0,

\frac{1}{n}\big|L_1(G(n,\lambda/n))\big| \le \zeta_\lambda+\epsilon, with high probability as n\rightarrow\infty.

Step 2: Using a lower bound on the exploration process, for \epsilon>0 small enough

\frac{1}{n}\big|L_1(G(n,\lambda/n))\big| \ge \epsilon, with high probability.

Step 3: Motivated by duality, we count isolated vertices to show

\mathbb{P}(\epsilon n\le |L_1| \le (\zeta_\lambda-\epsilon)n) \rightarrow 0.

We will return to uniqueness at the end.

Step 1

This step is unsurprising. The local limit gives control on how many vertices are in small components of various sizes, and so gives control on how many vertices are in small components of all finite sizes (taking limits in the right order). This gives a bound on how many vertices can be in the giant component. Continue reading


Lecture 6 – Local limits

I am aiming to write a short post about each lecture in my ongoing course on Random Graphs. Details and logistics for the course can be found here.

By this point of the course, we’ve studied several aspects of the Erdos-Renyi random graph, especially in the sparse setting G(n,\frac{\lambda}{n}). We’ve also taken a lengthy detour to revise Galton-Watson trees, with a particular focus on the case of Poisson offspring distribution.

This is deliberate. Note that a given vertex v of G(n,\frac{\lambda}{n}) has some number of neighbours distributed as \mathrm{Bin}(n-1,\frac{\lambda}{n})\stackrel{d}\approx\mathrm{Po}(\lambda), and the same approximation remains valid as we explore the graph (for example in a breadth-first fashion) either until we have seen a large number of vertices, or unless some ultra-pathological event happens, such as a vertex having degree n/3.

In any case, we are motivated by the notion that the local structure of G(n,\frac{\lambda}{n}) is well-approximated by the Galton-Watson tree with \mathrm{Po}(\lambda) offspring, and in this lecture and the next we try to make this notion precise, and discuss some consequences when we can show that this form of convergence occurs.

Deterministic graphs

Throughout, we will be interested in rooted graphs, since by definition we have to choose a root vertex whose local neighbourhood is to be studied. Usually, we will study a sequence of rooted graphs (G_n,\rho_n), where the vertex set of G_n is [n], or certainly increasing in n (as in the first example).

For some rooted graph (G,\rho), we say such a sequence (G_n,\rho_n) converges to (G,\rho) locally if for all radii r\ge 1, we have B_r^{G_n}(\rho_n)\simeq B_r^G(\rho). In words, the neighbourhood around \rho_n in G_n is the same up to radius r as the neighbourhood around \rho in G, so long as n is large enough (for given r).

This is best illustrated by an example, such as T_n, the binary tree to depth n.

If we take \rho_n to be the usual root, then the trees are nested, and converge locally to the infinite binary tree T_\infty. Slightly less obviously, if we take \rho_n to be one of the leaves, then the trees are still nested (up to labelling – ie in the sense of isomorphisms of rooted trees), and converge locally to the canopy tree, defined by a copy of \mathbb{Z}_{\ge 0} with nearest-neighbour edges, and where each vertex n\ge 1 is connected to the root of a disjoint copy of T_{n-1}, as shown below:

Things get more interesting when the root is chosen randomly, for example, uniformly at random, as this encodes more global information about the graphs G_n. In the case where the G_n are vertex-transitive, then if we only care about rooted graphs up to isomorphism, then it doesn’t matter how we choose the root.

Otherwise, we say that G_n converges in the local weak sense to (G,\rho) if, for all r\ge 1 and for all rooted graphs (H,\rho_H),

\mathbb{P}\left( B^{G_n}_r(\rho_n)\simeq (H,\rho_H) \right) \longrightarrow \mathbb{P}\left( B_r^G(\rho)\simeq H\right),

as n\rightarrow\infty.

Alternatively, one can phrase this as a result about convergence of rooted-graph-valued distributions.

A simple non-transitive example is G_n\simeq P_n, the path of length n. Then, the r-neighbourhood of a vertex is isomorphic to P_{2r}unless that vertex is within graph-distance (r-1) of one of the leaves of G_n. As n\rightarrow\infty, the proportion of such vertices vanishes, and so, \mathbb{P}\left( B^{P_n}_r(\rho_n)\simeq P_{2r}\right)\rightarrow 1, from which we conclude the unsurprising result that P_{n} converges in the local weak sense to \mathbb{Z}. (Which is vertex-transitive, so it doesn’t matter where we select the root.)

The binary trees offer a slightly richer perspective. Let \mathcal{L}_n be the set of leaves of T_n, and we claim that when \rho_n is chosen uniformly from the vertices of T_n, then d_{T_n}(\rho_n,\mathcal{L}_n) converges in distribution. Indeed, \mathbb{P}\left( d_{T_n}(\rho_n,\mathcal{L}_n)=k\right) = \frac{2^{n-k}}{2^{n+1}-1}, whenever n\ge k, and so the given distance converges in distribution to the Geometric distribution with parameter 1/2 supported on {0,1,2,…}.

This induces a random local weak limit, namely the canopy tree, rooted at one of the vertices we denoted by \mathbb{Z}_{\ge 0}, with the choice of this vertex given by Geometric(1/2). Continue reading

Lecture 4 – Hitting time theorem

I am aiming to write a short post about each lecture in my ongoing course on Random Graphs. Details and logistics for the course can be found here.

This lecture consisted of revision of the most relevant theory of Galton-Watson trees, with a focus on the case where the offspring distribution is Poisson, since, as we have seen in previous lectures, this is a strong candidate to approximate the structure of G(n,\lambda/n). It makes sense to cover the theory of the trees before attempting to make rigorous the sense of approximation.

Given a Galton-Watson tree T, it is natural to label the vertices in a breadth-first order as \varnothing=v_1,v_2,\ldots,v_{|T|}. This is easiest if we have constructed the Galton-Watson tree as a subset of the infinite Ulam-Harris tree, where vertices have labels like (3,5,17,4), whose parent is (3,5,17). If this child vertex is part of the tree, then so are (3,5,17,1), (3,5,17,2), and (3,5,17,3). This means our breadth-first order is canonically well-defined, as we have a natural ordering of the children of each parent vertex.

Note: one advantage of using breadth-first order rather than depth-first order (which corresponds to the usual dictionary, or lexicographic ordering of the labels) is that if the tree is infinite, we don’t explore all of it during a depth-first search. (In the sense that there exist vertices which are never given a finite label.) For breadth-first search, a similar problem arises precisely when some vertex has infinitely many children. For a conventional Galton-Watson tree, the latter situation is much less of a problem than the infinite total population problem, which happens with positive probability whenever \mu=\mathbb{E}[X]>1.

Anyway, given the depth-first order, one can consider an exploration process S_0,S_1,S_2,\ldots,S_{|T|} given by

S_0=1,\quad S_i=S_{i-1}+(X_i-1),

where X_i is the number of children of v_i. In this way, we see that

S_i=\big| \Gamma(v_1)\cup\ldots\cup\Gamma(v_i)\backslash \{v_1,\ldots,v_i\}\big|,\quad i\ge 1,

records the number of vertices in some stack containing those which we have ‘seen but not explored’. Some authors prefer to start from 0, in which case one ends up with a similar but slightly different interpretation of the ‘stack’, but that’s fine since we aren’t going to define formally what ‘seen’ and ‘explored’ means in this post.

Essentially, we exhaust the vertices of the tree whenever S_t=0, and so the condition that |T|=n requires

S_n=0,\quad S_m\ge 1,\; m=0,1,\ldots,n-1.

Conveniently, so long as we have avoiding ordering ambiguity, for example by insisting that trees live within the Ulam-Harris tree, we can reconstruct T uniquely from (S_0,S_1,\ldots,S_{|T|}).

Furthermore, if T is a Galton-Watson process, then the numbers of children X_i are IID, and so in fact this exploration process is a random walk, and the size of the tree can be recovered as the hitting time of zero.

Note: making fully rigorous the argument that children in the GW tree are independent of the breadth-first walk fully rigorous is somewhat technical, and not to be dismissed lightly, though not of principle interest at the level of this topics course. See Proposition 1.5 in Section 1.2 of Le Gall’s notes or Section 1.2.2 of my doctoral thesis for further discussion and argument.

The hitting time theorem allows us to study the distribution of the hitting time of a random walk whose increments are bounded below by -1, in terms of the distribution of the value of the random walk.

Theorem: Let (S_n,\, n\ge 0) be a random walk with S_0=0 and IID increments (X_n,n\ge 1) satisfying \mathbb{P}(X_n\ge -1)=1. Let H_{-k}=\inf \left\{n\,:\, S_n=-k\right\} be the hitting time of -k.

Then \mathbb{P}\big( H_{-k}=n\big) = \frac{k}{n}\mathbb{P}\big(S_n=-k).

Commentary: there are local central limit theorem estimates and large deviation estimates that allow good control of the probability on the RHS for a rich class of contexts. So at a meta-level, the hitting time theorem allows us to reduce a complicated (though still classical) problem, to a real classical problem, which is particularly helpful when the LHS is a device for capturing relevant information about our random tree model.

Continue reading

BMO1 2018

The first round of the British Mathematical Olympiad was sat yesterday. The paper can be found here, and video solutions here. Copyright for the questions is held by BMOS. They are reproduced here with permission.

I hope any students who sat the paper enjoyed at least some of the questions, and found it challenging! The following commentaries on the problems are not official solutions, and are not always full solutions at all, but contain significant steps of solutions, so would be best saved until after you have attempted the problems, if you are planning to do so. I’ve written quite a lot about Q5 because I found it hard (or at least time-consuming) and somewhat atypical, and I’ve written a lot about Q6 because there was a lot to say. I hope at least some of this is interesting to some readers of all levels of olympiad experience.

Question 1

A list of five two-digit positive integers is written in increasing order on a blackboard. Each of the five integers is a multiple of 3, and each digit {0,1,…,9} appears exactly once on the blackboard. In how many ways can this be done? (Note that a two-digit number cannot begin with zero.)

It’s a trope of BMO1 that the first question must be doable by some sort of exhaustive calculation or listing exercise. Of course, that is rarely the most efficient solution.

However, there is normally a trade-off between eliminating all listing, and reducing to a manageable task.

The key observation here is that writing the integers in increasing order is really just a way to indicate that order of the choices doesn’t matter. Even if that seems counter-intuitive. The question wants to know how many ways to choose these five numbers. The order of choice doesn’t matter since we’re going to put them in ascending order on the blackboard anyway.

You want to make your choices with as much independence as possible. So it would, for example, be a bad idea to choose the smallest number first. How many possibilities are there where the smallest number is 24? What about 42? What about 69? These are all different, and some are zero, so will make the computation very taxing.

However, you might notice that the digits {0,3,6,9} have to go together to form two numbers, and the rest have to pair up with one digit from {1,4,7} and one from {2,5,8}. You might know that an integer is divisible by 3 precisely if its digit sum is divisible by 3, but in this context you wouldn’t lose too much time by simply listing everything! These tasks are now completely separate, so you can take the number of ways to pair up {0,3,6,9} and multiply by the number of ways to pair up {1,4,7} and {2,5,8}. You need to take care over the ordering. It does (obviously) matter which is the first digit and which is the second digit in a number!

Continue reading