Gromov-Hausdorff Distance on Trees

This post continues the exposition of Gromov-Hausdorff distance, as introduced in Chapter Four of Steve Evans’ Probability and Real Trees, which we are reading as a group in the Stats Dept in Oxford at the moment. In this post, we consider applications of the Gromov-Hausdorff distance we have just introduced in the context of trees, viewed as metric spaces.

First we consider a direct application of the previous result, which related the Gromov-Hausdorff distance to the infimum of the distortion across the family of correspondences between the two relevant metric spaces. I found this as Corollary 3.7 in notes by Le Gall and Miermont on scaling limits of random trees and maps, which can be found here. I’m not clear whether there is an original source, but the result is simple enough that probably it does not matter hugely.

Proposition – Given f,g excursions above [0,1], and T_f, T_g the real trees associated with these excursions in the standard way, then

d_{GH}(T_f,T_g)\le 2 ||f-g||.

Proof: We construct an appropriate correspondence

\mathcal {R}=\left\{(a,b) \,:\, \exists t\in[0,1]\text{ s.t. } a=p_f(t), b=p_g(t)\right\}.

In other words, the trees are defined as projections from [0.1] (with some equivalence structure), so taking pairs of projections from [0,1] gives a natural correspondence. Now, suppose (p_f(s),p_g(s)), (p_f(t),p_g(t))\in \mathcal {R}. Then

d_{T_f}(p_f(s),p_f(t)) = f(s) + f(t) - 2\hat f(s,t),

where \hat f(s,t):= \min{r\in[s,t]}f(r). Obviously, we can replace f by g to obtain

d_{T_g}(p_g(s),p_g(t)) = g(s) + g(t) - 2\hat g(s,t).

By thinking slightly carefully about where the functions achieve their minima compared with where the minimum in the sup norm is achieved, we can conclude that

|d_{T_f}(p_f(s),p_f(t)) - d_{T_g}(p_g(s),p_g(t)) | \le 4||f-g||,

and so the result follows from the relation between Gromov-Hausdorff distance and the infimum of distortions over the set of correspondences proved at the end of the previous post.

Gromov-Hausdorff Limits of Real Trees

If we are going to consider the Gromov-Hausdorff topology for limits of tree, we want to be sure that the limit of a sequence of real trees is another real tree. In particular, we want to show that this convergence preserves the property of being a geodesic space.

Theorem 4.19 – Given X_n geodesic spaces, and X complete, such that X_n \stackrel{d_{GH}}\rightarrow X, then X is geodesic.

Proof: Here, I’ll go in the opposite order to Evans’ book, as I think it’s easier understand why a special case of this implies the whole result once you’ve actually shown that special case.

Anyway, we want to show that given x,y\in X, there is a geodesic segment x\leftrightarrow y in X. We will start by showing that there is a well-defined midpoint of the geodesic, that is a point z \in X such that d(x,z)=d(y,z)=\frac12 d(x,y).

Given arbitrary \epsilon>0, we can take n such that d_{GH}(X,X_n) < \frac{\epsilon}{3} and then a correspondence \mathcal{R} between X,X_n with \mathrm{dis}\mathcal {R}<\frac{2\epsilon}{3}. Now, by definition of correspondence, we have x',y'\in X_n such that (x,x'),(y,y')\in\mathcal{R}. But we do have geodesics in X_n, so we can take z'\in X_n the midpoint of geodesic x'\leftrightarrow y'. Predictably, we now take z\in X such that (z,z')\in\mathcal {R}.

We can show that z is ‘almost’ the midpoint of [x,y] in the sense that

|d(x,z) - \frac12 d(x,y)| \le |d(x',z')-\frac12 d(x',y')+\frac32 \mathrm{dis}\mathcal{R} \le \epsilon.

Similarly, we have |d(y,z)-\frac12 d(x,y)|\le \epsilon.

Perhaps it’s helpful to think of this point z that we’ve constructed as being like a taut, but not quite rigid string. The midpoint of the string has to be fairly near the midpoint of the endpoints, and in particular, as we let \epsilon\rightarrow 0, the z’s we deal with form a Cauchy sequence, and thus converge to some point, which (in a case of poor notation planning) we also call z \in X, which is the midpoint of [x,y] as described before.

We can now iterate this iteration, to demonstrate that whenever q is a dyadic rational in [0,1], there exists z_q \in X such that

d(x,z_q)=q d(x,y),\quad d(y,z_q) = (1-q)d(x,y).

Again then, if we want the above to hold for some general real r in [0,1], we can approximate r arbitrarily well by dyadic rationals q, and the associated points z_q are Cauchy and thus have a well-defined limit with the required properties. We thus have our geodesic segment x\leftrightarrow y.

Rooted Gromov-Hausdorff Distance

In the end, the trees we want to compare might be rooted. For example, we talk about finite trees being invariant under random re-rooting, and we might be interested in similar results for real trees, in particular the BCRT. So we need to compare metric spaces as viewed outwards from particular identified points of each space.

An isometry of rooted spaces must map the root to the root, and so we adjust to obtain rooted Gromov-Hausdorff distance. We might try to consider embeddings into a common space so that the roots are shared, but it will be more convenient to maintain the infimum over metrics on the disjoint union as before. But to ensure the roots are in roughly the ‘same’ place in both set embeddings, we minimise the maximum of the Hausdorff distance between the sets, and the distance between the images of the roots in the common covering space.

Similar results apply as in the unrooted case, and normally the proofs are very similar. As we might expect, when defining a correspondence between rooted spaces, we demand that the pair of roots is one of the roots in the correspondence, and then the same equivalence between the minimal distortion and the rooted G-H distance applies.

Evans shows that \mathbb{T}^{\mathrm{root}}, the set of compact rooted trees is complete and separable under the rooted G-H topology. Separability is relatively easy to see. Compact trees have finite \epsilon-nets, and there is a canonical way to view the net as the vertices of a finite tree with edge lengths. Approximating these edge lengths by rationals and consider the countable family of isomorphism classes of rooted finite trees gives separability.

If we want, we can also define k-pointed Gromov-Hausdorff distance, where we demand k points in each space are held fixed.

Tree \eta-erasure

To show that this is a natural topology to consider for the family of trees, Evans devotes a short section to the operation of \eta-erasure, where all points within \eta of a leaf are removed from a given tree.

Formally, R_\eta is a map \mathbb{T}^{\mathrm{root}}\rightarrow \mathbb{T}^{\mathrm{root}} (recalling that these are compact real trees), so that R_\eta(T) consists of the tree

\{\rho\}\cup\{a \in T \, : \, \exists x\in T,\, a\in[\rho, x],d_T(a,x)\ge \eta\},

rooted again at \rho. We claim that the range of R_\eta is the set of compact rooted trees with a finite number of leaves. In this setting, we want a geodesic definition of leaf, for example a leaf is a point that doesn’t lie in the interior of the (unique) geodesic segment between any other point and the root.

Given a tree T with a finite number of leaves, we can glue disjoint segments of length \eta onto every leaf. Taking R_\eta of this deeper tree will give T. Similarly, suppose R_\eta(T) has infinitely many leaves, which we can label a_1,a_2,\ldots. Thus we also have x_1,x_2,\ldots \in T such that a_i\in[\rho,x_i], and the segments \{[a_i,x_i]\} are disjoint, and all have length at least \eta, hence T cannot be compact, as it has no finite \frac{\eta}{2} net.

It is clear that for fixed T, the family of these maps applied to T is continuous with respect to \eta in the G-H topology. When \eta is changed a small amount, the amount of extra tree removed is locally small, and so approximating correspondences by points in what is left is absolutely fine. Indeed, the operations T_{\eta_1}, T_{\eta_2} commute to give T_{\eta_1+\eta_2}.

We want to show that for fixed \eta this map R_\eta is continuous with respect to G-H topology. Suppose two compact rooted trees S,T are covered by a rooted correspondence \mathcal R with distortion \epsilon \ll \eta. We can’t immediately restrict \mathcal R to R_\eta(S)\times R_\eta(T), as it won’t necessarily be a surjection under the projection maps any more.

But we can get around this. Note that if (x,y)\in\mathcal{R}, then |d_S(\rho_S,x) - d_T(\rho_T,y)| \le \epsilon by assumption. We will construct a correspondence \mathcal R' on the erased trees as follows. Given (x,y)\in\mathcal{R}, if (x,y)\in R_\eta(S)\times R_\eta(T), we keep it, and if x\not\in R_\eta(S), y\not \in R_\eta(y), we throw it away. Suppose we have (s,t) in \mathcal{R} with s\in R_\eta(S), t\not \in R_\eta(T). Let l_s be a leaf of S such that s \in [\rho_S,l_s], and let \bar t be the farthest point from the root of the geodesic [\rho_T,t] restricted to R_\eta(T). So the tree above \bar t includes t and has height at most \eta.

In words, if a point appears in a pair in the correspondence but is removed by the erasure, we replace it in the pair with the point closest to the original point that was not removed by the erasure.

It remains to prove that this works. My original proof was short but false, and its replacement is long (and I hope true now), but will postpone writing this down either until another post, or indefinitely. The original proof by Evans and co-authors can be found as the main content of Lemma 2.6 in the original paper [1].

REFERENCES

[1] Evans, Pitman, Winter – Rayleigh processes, real trees, and root growth with regrafting.

Advertisement

1 thought on “Gromov-Hausdorff Distance on Trees

  1. Pingback: Real Trees – Root Growth and Regrafting | Eventually Almost Everywhere

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s