As I explained in my previous post, I haven’t been reading around as much as I would generally like to recently. A few days in London staying with my parents and catching up with some friends has therefore been a good chance to get back into the habit of leafing through papers and Pitman’s book among other things.
This morning’s post should be a relatively short one. I’m going to define the contour process, a function of a (random or deterministic) tree, related to the exploration process which I have mentioned a few times previously. I will then use this to prove a simple but cute result equating in distribution the sizes of two different branching processes via a direct bijection.
The Contour Process
To start with, we have to have a root, and from that root we label the tree with a depth-first labelling. An example of this is given below. It is helpful at this stage to conceive this process as an explorer walking on the tree, and turning back on themselves only when there is no option to visit a vertex they haven’t already seen. So in the example tree shown, the depth-first exploration visits vertex V_2 exactly four times. Note that with this description, it is clear that the exploration traverses every edge exactly twice, and so the length of the sequence is 2n-1, where n is the number of vertices in the tree since obviously, we start and end at the root.
Another common interpretation of this depth-first exploration is to take some planar realisation of the tree. (Note trees are always planar – proof via induction after removing a leaf.) Then if you treat the tree as a hedge and starting at the root walk along, following the outer boundary with your right hand, this exactly recreates the process.
The height of a tree at a particular vertex is simply the graph distance between that vertex and the root. So when we move from one vertex to an adjacent vertex, the height must increase or decrease by 1.
The contour process is the sequence of heights seen along the depth-first exploration. It is therefore a sequence:
and such that .
Note that though the contour process uniquely determines the tree structure, the choice of depth-first labelling is a priori non-canonical. For example, in the display above, V_3 might have been explored before V_2. Normally this is resolved by taking the suitable vertex with the smallest label in the original tree to be next. It makes little difference to any analysis to choose the ordering of descendents of some vertex in a depth-first labelling randomly. Note that this explains why it is rather hard to recover Cayley’s theorem about the number of rooted trees on n vertices from this characterisation. Although the number of suitable contour functions is possible to calculate, we would require a complicated multiplicative correction for labelling if we wanted to recover the number of trees.
The only real observation about the uses of the contour process at this stage is that it is not in general a random walk with IID increments for a Galton-Watson branching process. This equivalence is what made the exploration process so useful. In particular, it made it straightforward, at least heuristically, to see why large trees might have a limit interpretation through Brownian excursions. If for example, the offspring distribution is bounded above, say by M, then the contour process certainly cannot be a random walk, as if we have visited a particular vertex exactly M+1 times, then it cannot have another descendent, and so we must return closer to the root at the next step.
I want to mention that in fact Aldous showed his results on scaling limits towards the Continuum Random Tree through the contour process rather than the exploration process. However, I don’t want to say any more about that right now.
A Neat Equivalence
What I do want to talk about is the following distribution on the positive integers. This comes up in Balazs Rath and Balint Toth’s work on forest-fires on the complete graph that I have been reading about recently. The role of this distribution is a conjectured equilibrium distribution for component size in a version of the Erdos-Renyi process where components are deleted (or ‘struck by lightning’) at a rate tuned so that giant components ‘just’ never emerge.
This distribution has the possibly useful property that it is the distribution of the total population size in a Galton-Watson process with Geom(1/2) offspring distribution. It is also the distribution of the total number of leaves in a critical binary branching process, where every vertex has either two descendents or zero descendents, each with probability 1/2. Note that both of these tree processes are critical, as the expected number of offspring is 1 in each case. This is a good start, as it suggests that the relevant equilibrium distribution should also have the power-law tail that is found in these critical branching processes. This would confirm that the forest-fire model exhibits self-organised criticality.
Anyway, as a sanity check, I tried to find a reason why, ignoring the forest-fires for now, these two distributions should be the same. One can argue using generating functions, but there is also the following nice bijective argument.
We focus first on the critical Geometric branching process. We examine its contour function. As explained above, the contour process is not in general a random walk with IID increments. However, for this particular case, it is. The geometric distribution should be viewed as the family of discrete memoryless distributions.
This is useful for the contour process. Note that if we are at vertex V for the (m+1)th time, that is we have already explored m of the edges out of V, then the probability that there is at least one further edge is 1/2, independently of the history of the exploration, as the offspring distribution is Geometric(1/2), which we can easily think of as adding edges one at a time based on independent fair coin tosses until we see a tail for example. The contour process for this random tree is therefore a simple symmetric random walk on Z. Note that this will hit -1 at some point, and the associated contour process is the RW up to the final time it hits 0 before hitting -1. We can check that this obeys the clear rule that with probability 1/2 the tree is a single vertex.
Now we consider the other model, the Galton-Watson process with critical binary branching mechanism. We should consider the exploration process. Recall that the increments in this process are given by the offspring distribution minus one. So this random sequence also behaves as a simple symmetric random walk on Z, again stopped when we hit -1.
To complete the bijective argument, we have to relate leaves in the binary process to vertices in the geometric one. A vertex is a leaf if it has no offspring, so the number of leaves is the number of times before the hitting time of -1 that the exploration process decreases by 1. (*)
Similarly for the contour process. Note that there is bijection between the set of vertices that aren’t the root and the set of edges. The contour process explores every edge exactly twice, once giving an increase of 1 and once giving a decrease of 1. So there is a bijection between the times that the contour process decreases by 1 and the non-root vertices. But the contour process was defined only up to the time we return to the root. This is fine if we know in advance how large the tree is, but we don’t know which return to the root is the final return to the root. So if we extend the random walk to the first time it hits -1, the portion up until the last increment is the contour process, and the final increment must be a decrease by 1, hence there is a bijection between the number of vertices in the Geom(1/2) G-W tree and the number of times that the contour process decreases by 1 before the hitting time of -1. Comparing with (*) gives the result.