# Real Trees – Root Growth and Regrafting

Two weeks ago in our reading group meeting, Raphael told us about Chapter Five which introduces root growth and regrafting. One of the points of establishing the Gromov-Hausdorff topology in this book was to provide a more natural setting for a discussion of tree-valued processes. Indeed in what follows, one can imagine how to start the construction of a similar process for the excursions which can be used to encode real trees, involving cutting off sub-excursions above one-sided local minima, then glueing them back in elsewhere. But taking account of the equivalence structure will be challenging, and it is much nicer to be able to describe cutting a tree in two by removing a single point without having to worry about quotient maps.

We have seen in Chapter Two an example of a process defined on the family of rooted trees with n labelled vertices which has the uniform rooted tree as an invariant distribution. Given a rooted tree with root p, we choose uniformly at random a vertex p’ in [n] to be the new root. Then if p’=p we do nothing, otherwise we remove the unique first edge in the path from p’ to p, giving two trees. Adding an edge from p to p’ completes the step and gives a new tree with p’ as root. We might want to take a metric limit of these processes as n grows and see whether we end up with a stationary real tree-valued process whose marginals are the BCRT.

To see non-trivial limiting behaviour, it is most interesting to consider the evolution of a particular subtree (which includes the root) through this process. If the vertex chosen for cutting lies in our observed subtree, then the subtree undergoes a prune and regraft operation. On the other hand, if the vertex chosen for cutting does not lie in the subtree, then we do not see any effect of the pruning, except the addition of a new vertex below the original root, which becomes the new root. So essentially, from the point of view of our observed subtree, the root is growing.

Now we can think about interpreting the dynamics of a natural limit process acting on real trees. The key idea is that we don’t change the set on which the tree is supported much, but instead just change the metric. In particular, we will keep the original tree, and add on length at unit rate. Of course, where this length gets added on entirely determines the metric structure of the tree, but that doesn’t stop us giving a simple ‘name’ for the extra length. If we consider a process $X^T$ starting from a particular finite subtree T, then at time t, the tree $X^T_t$ is has vertex set $T \coprod (0,t]$. (Finite subtree here means that it has finite total length.)

Root regrafting should happen at a rate proportional to the total length of the current observed tree. This is reasonable since after all it is supported within a larger tree, so in the discrete case the probability of a prune-regrafting event happening within a given observed subtree is proportional to the number of vertices in that subtree, which scales naturally as length in the real tree limit. It turns out that to get unit rate root growth with $\Theta(1)$ rate prune-regrafting, we should consider subtrees of size $\sqrt{n}$ within a host tree of size n as $n\rightarrow\infty$. We also rescale the lengths by $\frac{1}{\sqrt{n}}$, and time by $\sqrt{n}$ so we actually see prune-regraft events.

Furthermore, if the subtree is pruned, the location of the pruning is chosen uniformly by length of the current observed subtree. So we can view the pruning process as being driven by a Poisson point process with intensity given by the instantaneous length measure of the tree, which at time t has vertex set $T\coprod (0,t]$. It will turn out to be consistent that there is a ‘piecewise isometry’ for want of a better phrase between the metric (and thus length measure) on $X^T_t$ and the canonical induced measure on $T\coprod (0,t]$, so we can describe the instances and locations of the pruning events via a pair of PPPs. The first is supported on $T \times [0,\infty)$, and the second on $\{(t,x): 0 \le x \le t\}$, since we only ‘notice’ pruning at the point labelled x if the pruning happens at some time t after x was created.

If we start from a compact tree T, then the total intensity of this pair is finite up to some time t, and so we have a countable sequence $\tau_0=0<\tau_1<\tau_2<\ldots$ of times for pruning events. It is easy to describe (but a bit messy to notate) the evolution of the metric between these pruning times. Essentially the distance between any pair of points in the observed tree at time $\tau_m$ with root $\rho_{\tau_m}$ is constant between times $\tau_m,\tau_{m+1}$, and new points are added so that the distance between $\rho_{\tau_m}$ and any new point $a\in(\tau_m,\tau_{m+1}]$ is $a-\tau_m$, and everything thing else follows from straightforward consideration of geodesics.

When a pruning event happens at point $x_m$ at time $\tau_m$, distances are preserved within the subtree above $x_m$ in $X^T_{\tau_m -}$, and within the rest of the tree. Again, an expression for the cross distances is straightforward but requires a volume of notation not ideally suited to this medium.

The natural thing to consider is the coupled processes started from different subtrees (again both must contain the original root) of the same host tree. Say $T^1,T^2\le T$, then it is relatively easy to check that $X^{T^1}_t,X^{T^2}_t \le X^T_t \,\forall t$, when we drive the processes by consistent coupled Poisson processes. Furthermore, it is genuinely obvious that the Hausdorff distance between $X^{T^1}_t,X^{T^2}_t$, here viewed as compact subsets of $(X^T_t, d^T_t)$ remains constant during root growth phase.

Less obvious but more important is that the Hausdorff distance decreases during regrafting events. Suppose that just before a regrafting event, the two subtrees are T’ and T”, and the Hausdorff distance between them is $\epsilon$. This Hausdorff distance is with respect to the metric on the whole tree T. [Actually this is a mild abuse of notation – I’m now taking T to be the whole tree just before the regraft, rather than the tree at time 0.]

So for any $a\in T'$, we can choose $b\in T''$ such that $d_T(a,b)\le \epsilon$. This is preserved under the regraft unless the pruning point lies on the geodesic segment (in T) between a and b. But in that case, the distance between a and the pruning point is again at most $\epsilon$, and so after the regrafting, a is at most $\epsilon$ away from the new root, which is in both subtrees, and in particular the regrafted version of T”.

This is obviously a useful first step on the path to proving any kind of convergence result. There are some technicalities which we have skipped over. It is fairly natural that this leads to a Markov process when the original tree is finite, but it is less clear how to define these dynamics when the total tree length is infinite, as we don’t want regrafting events to be happening continuously unless we can bound their net effect in some sense.

Last week, Franz showed us how to introduce the BCRT into matters. Specifically, that BCRT is the unique stationary distribution for this process. After a bit more work, the previous result says that for convergence properties it doesn’t matter too much what tree we start from, so it is fine to start from a single point. Then, the cut points and growth mechanism corresponds very well to the Poisson line-breaking construction of the BCRT. With another ‘grand coupling’ we can indeed construct them simultaneously. Furthermore, we can show weak convergence of the discrete-world Markov chain tree algorithm to the process with these RG with RG dynamics.

It does seem slightly counter-intuitive that a process defined on the whole of the discrete tree converges to a process defined through subtrees. Evans remarks in the introduction to the chapter that this is a consequence of having limits described as compact real trees. Then limitingly almost all vertices are close to leaves, so in a Hausdorff sense, considering only $\sqrt{n}$ of the vertices (ie a subtree) doesn’t really make any difference after rescaling edge lengths. I feel I don’t understand exactly why it’s ok to take the limits in this order, but I can see why this might work after more checking.

Tomorrow, we will have our last session, probably discussing subtree prune-and-regraft, where the regrafting does not necessarily happen at the root.