# Long Paths and Expanders

I’m in Birmingham this week for the LMS-EPSRC summer school on Random Graphs, Geometry and Asymptotic Structure. The event consists of three five-hour mini-courses, a plenary lecture, leaving plenty of time for problem sheet and discussion. I thought it would be worth trying to say a couple of interesting things each day – I do not know whether this will succeed, but I might as well try.

Today, a few thoughts on the first two lectures of Michael Krivelevich’s course on Long Paths and Hamiltonicity in Random Graphs. The aim is to develop tools to investigate the threshold for the presence of a Hamiltonian cycle in G(n,p). In this first part of the course, we were mainly thinking about long paths.

One tool we used a lot was the Depth-First Search algorithm. This is very similar to the exploration process I’ve talked about before. Essentially, here we consider trying to explore the graph in a depth-first way, but instead of viewing all the edges incident to a vertex we have just arrived at, we only look to see whether there is an edge out of the new vertex. If there is, we explore it, then come back eventually to look for more. It really comes down to a difference in the information we are storing. In this DFS, we store the vertices which we haven’t finished exploring, which is the set of vertices on the explored path between the root and the current vertex. So the size of this set evolves like the contour process. In particular, we can read off the sizes of paths from this description. These dynamics are useful in particular because we know there are no edges between the set of vertices we have finished exploring, and the ones we have yet to explore. The stack of ‘processing’ vertices must glue everything else together.

We can translate one of the arguments back into the language for the old exploration process. Recall the increments of the exploration process are $\mathrm{Bin}(\alpha n,\frac{c}{n}) -1$ once we have explored $\alpha n$ vertices. We don’t need to worry about the -1 bit for now. Observe that because we are exploring in a depth-first way, if a subsequence of the Binomial variables of length k are all positive, this corresponds to a path of length (k-1).

So to prove, for example, that the longest path in a subcritical random graph is O(log n), it suffices to prove that there are O(log n) consecutive positive entries in the sequence of n binomial entries. Since the distribution changes continuously, it is convenient to prove that there are O(log n) consecutive positive entries in the first $\epsilon n$ binomial entries. The probability that any of these entries is positive is bounded below by some p, so it suffices to consider instead a sequence of Bernoulli RVs with parameter p. So if we never have clog n consecutive, this gives control of the sequence of geometric random variables corresponding to the gaps between 0s in the sequence. Precisely, these are Geom(q), and we must have $\frac{\epsilon n}{c\log n}$ of them independently being less than clog n. We have to chase a few constants, and use the fact that if $f(n)\rightarrow\infty, \frac{g(n)}{f(n)}\rightarrow\infty$, then

$(1-\frac{1}{f(n)})^{g(n)}\rightarrow 0,$

by comparison with the standard asymptotic result for $e^{-x}$. In any case, we get that this probability tends to 0 if we choose c small enough, and so with high probability there is a path of length clog n.

This is interesting, because we knew already that the largest component in a subcritical random graph had size O(log n). But we also knew that all the components were trees, or ‘almost trees’, and were uniformly chosen from the set of trees (or trees + an edge or two) with appropriate size. And the largest path in a UST on n vertices is $O(n^{1/2})$ with high probability. So we learn that there are enough components of size $\geq c\log n$ that it is actually very probable that one of them will have the unlikely property of being much more path-like than a typical tree.

Krivelevich also showed a pleasant elementary proof of the result that a supercritical random graph has a path of length O(n), using a similar idea.

The other definition of major interest was an expander graph. Often when doing calculations about neighbourhoods of sets of vertices, we run into the problem that the neighbourhoods may overlap, and so we cannot get the total outer neighbourhood (or outer boundary) just by summing over the individual neighbourhood sizes. In an expander graph, we demand that all small sets of vertices have neighbourhood at least as large as some constant multiple of the set size, essentially giving us a bound on the above problem. Concretely, G is a $(k,\alpha)$-expander is for any set of vertices $|U|\leq k, |N(U)|\geq \alpha |U|$.

There’s a very nice argument using Posa’s lemma, where we consider all the possible ways to rearrange the vertices in some longest path into a different longest path, and then focus on the endpoints of all these paths. With this so-called rotation-extension technique, we can show that a (k,2)-expander has a path of length at least 3k-1.

There are structural similarities between expander graphs and regular graphs, so it seems natural that there will be some interesting spectral properties. I don’t know much about this, but perhaps it will come up later in the week. But, returning to the random graph long path problem, it now suffices to show subcritical G(n,p) is a (clog n,2)-expander for some c. Expander properties are in some sense the opposite of clustering properties, and independence of a RG inhibit most clustering properties (as discussed in much greater detail in some of the posts about network models). Unfortunately, this doesn’t actually work, as in a subcritical graph, the typical expansion coefficient, even of a small set will be c, for G(n,c/n), which is not large enough. However, if you chose the constants carefully, such an argument should work for c>2, so long as you chose k=an, with a small enough that the probability of a vertex elsewhere in the graph being joined to (at least) two of the k vertices in the set, was small compared with (c-2).

REFERENCES

The course notes are not available, though chapter 3 from these 2010 notes by the same lecturer are related and interesting.