I am aiming to write a short post about each lecture in my ongoing course on Random Graphs. Details and logistics for the course can be found here.
This lecture consisted of revision of the most relevant theory of Galton-Watson trees, with a focus on the case where the offspring distribution is Poisson, since, as we have seen in previous lectures, this is a strong candidate to approximate the structure of . It makes sense to cover the theory of the trees before attempting to make rigorous the sense of approximation.
Given a Galton-Watson tree T, it is natural to label the vertices in a breadth-first order as . This is easiest if we have constructed the Galton-Watson tree as a subset of the infinite Ulam-Harris tree, where vertices have labels like (3,5,17,4), whose parent is (3,5,17). If this child vertex is part of the tree, then so are (3,5,17,1), (3,5,17,2), and (3,5,17,3). This means our breadth-first order is canonically well-defined, as we have a natural ordering of the children of each parent vertex.
Note: one advantage of using breadth-first order rather than depth-first order (which corresponds to the usual dictionary, or lexicographic ordering of the labels) is that if the tree is infinite, we don’t explore all of it during a depth-first search. (In the sense that there exist vertices which are never given a finite label.) For breadth-first search, a similar problem arises precisely when some vertex has infinitely many children. For a conventional Galton-Watson tree, the latter situation is much less of a problem than the infinite total population problem, which happens with positive probability whenever .
Anyway, given the depth-first order, one can consider an exploration process given by
where is the number of children of . In this way, we see that
records the number of vertices in some stack containing those which we have ‘seen but not explored’. Some authors prefer to start from 0, in which case one ends up with a similar but slightly different interpretation of the ‘stack’, but that’s fine since we aren’t going to define formally what ‘seen’ and ‘explored’ means in this post.
Essentially, we exhaust the vertices of the tree whenever , and so the condition that requires
Conveniently, so long as we have avoiding ordering ambiguity, for example by insisting that trees live within the Ulam-Harris tree, we can reconstruct T uniquely from .
Furthermore, if T is a Galton-Watson process, then the numbers of children are IID, and so in fact this exploration process is a random walk, and the size of the tree can be recovered as the hitting time of zero.
Note: making fully rigorous the argument that children in the GW tree are independent of the breadth-first walk fully rigorous is somewhat technical, and not to be dismissed lightly, though not of principle interest at the level of this topics course. See Proposition 1.5 in Section 1.2 of Le Gall’s notes or Section 1.2.2 of my doctoral thesis for further discussion and argument.
The hitting time theorem allows us to study the distribution of the hitting time of a random walk whose increments are bounded below by -1, in terms of the distribution of the value of the random walk.
Theorem: Let be a random walk with and IID increments satisfying . Let be the hitting time of -k.
Commentary: there are local central limit theorem estimates and large deviation estimates that allow good control of the probability on the RHS for a rich class of contexts. So at a meta-level, the hitting time theorem allows us to reduce a complicated (though still classical) problem, to a real classical problem, which is particularly helpful when the LHS is a device for capturing relevant information about our random tree model.
Proof context/history: The case k=1, as relevant to the trees, is often referred to as Dwass’s theorem, though the combinatorial argument pre-dates this, sometimes in the form of the similar ballot theorem. A complete reference list would be challenging, but it certainly appears in Feller, due to Spitzer. In his recent book, van der Hofstad uses an argument by induction on n, noting that the statement is obvious when n=0,1,…,k-1, and obvious for different reasons when n=k. This argument can also be found published here.
I’m going to give a combinatorial argument, similar to one I’ve given in a post from years ago (back then, for the case k=1), because there wasn’t time to include this in the lecture.
Proof: from the point of view of the course, the most important probabilistic take home message from this argument is that we get significant control by introducing the randomness in two stages, and conditioning on the first round of randomness.
Here, what we do is to condition on the increments in cyclic order, ie without distinguishing between and . We’ll declare which of these is the true ordering at the end. Anyway, the punchline is the following lemma.
Lemma: when , and all , then exactly k of the n orderings corresponding to the cyclic ordering have the property that .
With this lemma, it’s clear why the hitting time theorem holds. We condition on . If the sum of these elements is not -k, then there’s no chance that the hitting time is n. If the sum is -k, then when we choose a true ordering at random, we get the hitting time as n with conditional probability k/n.
Proof of lemma: At all times, take indices modulo n. We want labels satisfying:
Step 1: Suppose there are satisfying (*). It’s helpful to think about going round the circle of values. If we go all the way round the circle, we get -k, but if we go from then we have an initial segment from , and so
However, if we repeat this for the k+1 such blocks, we end up going all the way round the circle, and get
which is a contradiction since the total sum is k. So we have at most k such values of .
Step 2: Now we show that there is at least one such . Let
and if there are multiple such m, let be the minimal such. Now set , and we will show this satisfies the requirements.
- Then , whenever .
- And when , we have
And by assumption, the first bracket is strictly greater than the second bracket, so
Step 3: Now we have an which satisfies, we may assume WLOG that it is . Now we construct the hitting times:
We claim that all these also satisfy the requirements for . This is fairly clear from a diagram, or from the fact that when , then