# Analytic vs Probabilistic Arguments for a Supercritical BP

This follows on directly from the previous post. I was originally going to talk only about what follows, but I got rather carried away with the branching process account. I was stuck on a particular exercise, and we ended up coming up with two arguments: one analytic and one probabilistic. Since the typical flavour of this blog is to present problems which show the advantage of the probabilistic approach, it seems only fair to remark on this case, where the analytic method was less interesting, but much simpler.

Recall that we have a supercritical random graph $G(n,\frac{\lambda}{n}), \lambda>1$, and we are considering the rescaled exploration process $S_{nt}$, which has asymptotic mean $\mu_t=1-t-e^{-\lambda t}$. We can calculate similarly an expression for the asymptotic variance

$\frac{\text{Var}(S_{nt})}{n}\rightarrow v_t=e^{-\lambda t}(1-e^{-\lambda t}).$

To use this to verify the result about the size of the giant component, we check that $\mu_{\zeta_\lambda+x/\sqrt{n}}$ is negative, and that the variance there is small, which would confirm that the giant component has size bounded above by $\zeta_\lambda$ almost surely. A similar argument is required for the lower bound. The variance is a separate matter, but it is therefore necessary that $\mu_t$ should be decreasing at $t=\zeta_\lambda$, that is $\mu_{\zeta_\lambda}'=\lambda e^{-\lambda \zeta_\lambda}-1<0$. This is what we try to prove in the remainder of this post. Recall that in the previous post we checked that $\mu_t$ itself is equal to zero at $t=\zeta_\lambda$.
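As a quick numerical sanity check of this claim (not a proof, and not part of the original exercise), here is a minimal sketch that solves $1-\zeta=e^{-\lambda\zeta}$ by bisection and evaluates $\mu$ and its derivative at the root. The function names `mu`, `mu_prime` and `giant_fraction` are my own labels, and the values of $\lambda$ are arbitrary.

```python
import math

def mu(t, lam):
    """Asymptotic mean of the rescaled exploration process: 1 - t - exp(-lam t)."""
    return 1.0 - t - math.exp(-lam * t)

def mu_prime(t, lam):
    """Derivative of mu with respect to t."""
    return lam * math.exp(-lam * t) - 1.0

def giant_fraction(lam, iters=200):
    """Positive root zeta of mu(., lam), found by bisection (requires lam > 1)."""
    if lam <= 1.0:
        raise ValueError("a positive root exists only for lam > 1")
    # mu is concave with its maximum at log(lam)/lam, where it is positive,
    # and mu(1) = -exp(-lam) < 0, so these two points bracket the root.
    lo, hi = math.log(lam) / lam, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mu(mid, lam) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for lam in (1.1, 1.5, 2.0, 4.0):
    zeta = giant_fraction(lam)
    print(f"lam={lam}: zeta={zeta:.6f}, "
          f"mu(zeta)={mu(zeta, lam):.1e}, mu'(zeta)={mu_prime(zeta, lam):.6f}")
```

In each case the printed derivative should come out negative, in line with what we are trying to prove.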

## Heuristic Explanation

$\mu_t$ has been rescaled from the original definition of the exploration process in both size and time-scale, so some care is needed to see why this should hold in the limit. Remember that all components apart from the giant component are of size $O(\log n)$. So immediately after exhausting the giant component, you are likely to be visiting components of size roughly $\log n$. A time interval of $dt$ for $\mu$ corresponds to $n\,dt$ for $S$, during which $S$ will visit some components of size $\log n$, some of size $O(1)$, and some in between. In particular, some fixed proportion of vertices are isolated, that is, in a component of size 1.

There is then a complicated size-biasing train of thought. A component of size $\log n$ is more likely to come up than an isolated vertex, but there are not as many of them. The $\log n$ components push the derivative $\mu_t'$ towards zero, because $S_t$ decreases by 1 over a time interval of length $\log n$, which gives a gradient of zero in the limit. However, the isolated vertices give a gradient of $-1$, because $S_t$ decreases by 1 over a time interval of 1. Although components of size $\log n$ are likely to appear earlier, it remains the case that after exhausting a component (in particular, at time $t=\zeta_\lambda$, after exhausting the giant component), you will choose an isolated vertex next with probability bounded away from zero. The component size only affects the time-scale if it is of order $n$, which none of the remaining components are, so the derivative $\mu_{\zeta_\lambda}'$ consists of some complicated weighted mean of 0 and $-1$. In particular, it is negative.

## Analytic solution

Obviously, that won’t do in practice. Suppressing the subscript $\lambda$ on $\zeta$ for ease of notation, the key fact is $e^{-\lambda \zeta}=1-\zeta$. We want to show that $\lambda e^{-\lambda \zeta}<1$. Substituting

$\lambda=-\frac{\log(1-\zeta)}{\zeta},$

means that it is required to show:

$-\frac{1-\zeta}{\zeta}\log(1-\zeta)<1.$

Differentiating the left hand side gives:

$\frac{\log(1-\zeta)+\zeta}{\zeta^2}<0,$

since of course $\log(1-\zeta)=-\zeta-\frac{\zeta^2}{2}-\frac{\zeta^3}{3}-\dots$. So the left hand side is decreasing in $\zeta$, and it suffices to check the result for small $\zeta$. But, again using a Taylor series:

$-\frac{1-\zeta}{\zeta}\log(1-\zeta)=1-\frac12\zeta+O(\zeta^2)<1,$

for small $\zeta$. This gives the required result.
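The two steps of this argument (monotonicity, then the small-$\zeta$ expansion) can be checked numerically. The following is only a sketch on a finite grid, with `lhs` a name I have made up for the left hand side of the inequality.

```python
import math

def lhs(zeta):
    """-(1 - zeta)/zeta * log(1 - zeta), i.e. lam * exp(-lam * zeta) written in terms of zeta."""
    return -(1.0 - zeta) / zeta * math.log1p(-zeta)

grid = [k / 1000 for k in range(1, 1000)]  # zeta = 0.001, ..., 0.999
values = [lhs(z) for z in grid]

# The bound lhs < 1 on the grid.
print(all(v < 1.0 for v in values))
# Strictly decreasing along the grid, matching the sign of the derivative.
print(all(a > b for a, b in zip(values, values[1:])))
# Small-zeta behaviour, to be compared with 1 - zeta/2.
print(lhs(1e-4), 1.0 - 0.5e-4)
```

Of course a grid check proves nothing, but it is reassuring that the sign conventions above are consistent.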

## Probabilistic Interpretation and Solution

First, we observe that $\lambda e^{-\lambda\zeta}=\lambda(1-\zeta)$ is the expected number of vertices in the first generation of a $\text{Po}(\lambda)$ branching process whose progeny become extinct. This motivates considering the canonical decomposition of a supercritical branching process $Z$ into the skeleton process and the dual process. The skeleton $Z^+$ consists of all vertices which have infinitely many successors. It is relatively easy to show that this is a branching process with offspring distribution $\text{Po}(\lambda\zeta)$ conditioned on being positive. The dual process $Z^*$ is a G-W branching process with offspring distribution $\text{Po}(\lambda)$ conditioned on dying. This is the same as a branching process with offspring distribution $\text{Po}(\lambda(1-\zeta))$, by a sprinkling (or thinning) argument, which says that if we begin with a Poisson number of things, then remove each one independently with some fixed probability, the remaining number of things is also Poisson.
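The claim about the dual offspring distribution can also be checked directly, using Bayes' rule instead of the thinning argument: the root has $k$ children and the tree dies out with probability $\mathbb{P}(\text{Po}(\lambda)=k)(1-\zeta)^k$, and the overall extinction probability is $1-\zeta$. Here is a minimal sketch for $\lambda=2$ (an arbitrary choice); the helper names are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """P(Po(mean) = k), computed via logs to avoid overflow."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

lam = 2.0

# Survival probability: iterate zeta -> 1 - exp(-lam * zeta), starting from 1.
zeta = 1.0
for _ in range(500):
    zeta = 1.0 - math.exp(-lam * zeta)

def dual_offspring_pmf(k):
    """P(root has k children | the whole tree dies out), by Bayes' rule."""
    # Extinction needs all k subtrees to die, each with probability 1 - zeta;
    # the normalising constant is the overall extinction probability 1 - zeta.
    return poisson_pmf(k, lam) * (1.0 - zeta) ** k / (1.0 - zeta)

dual_mean = lam * (1.0 - zeta)
max_gap = max(abs(dual_offspring_pmf(k) - poisson_pmf(k, dual_mean)) for k in range(40))
print(f"zeta = {zeta:.6f}, dual mean = {dual_mean:.6f}, max pmf gap = {max_gap:.2e}")
```

The gap should be at the level of floating-point error, and the dual mean $\lambda(1-\zeta)$ should come out below 1.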

We can construct the original branching process by

• With probability $\zeta$, take the skeleton, and affix independent copies of $Z^*$ at every vertex in the skeleton.
• With probability $1-\zeta$, just take a copy of $Z^*$.

It is immediately clear that $\lambda(1-\zeta)\leq 1$. After all, the dual process is almost surely finite, so the offspring distribution cannot have expectation greater than 1. Checking that the inequality is strict is more fiddly. The best way I have come up with is to examine the tail of the distribution of total population size of the original branching process.

The total population size $T$ of a branching process has an exponential tail if the offspring distribution is subcritical. It isn’t hugely surprising that this behaves like a large deviation for iid RVs, since in the limit such an event requires a lot of the offspring counts to deviate substantially from the mean. The same holds in the supercritical case, with the additional complication that though the finite tail decays exponentially, there is positive probability that the total size will be infinite. In the critical case, however, there is a power-law decay. This is not hugely surprising as it marks the threshold for the appearance of the infinite population, just as in a multiplicative coalescent at time 1, we have a load of very large components just about to form a giant component. The tool for all of these results is Dwass’s Theorem, which says:

$\mathbb{P}(T=n)=\frac{1}{n}\mathbb{P}(X_1+\ldots+X_n=n-1),$

where $X_1,\ldots,X_n$ are iid with the offspring distribution. When $\mathbb{E}X_1\neq 1$, this is a large deviation event, for which Cramér’s theorem applies (assuming, as is the case for the Poisson distribution, that the offspring distribution has finite exponential moments). When $\mathbb{E}X_1=1$, the Central Limit Theorem says that with high probability,

$X_1+\ldots+X_n\in [n-n^{3/4},n+n^{3/4}],$

so, skating over the details of whether everything is exactly uniform within this CLT scaling window,

$\mathbb{P}(T=n)\geq \frac{1}{n}\cdot\frac{1}{2n^{3/4}}.$

The true power-law decay is slower than this (the exponent is $3/2$ rather than the $7/4$ obtained here), but the above argument works as a back-of-the-envelope bound.

In particular, if the dual process had mean 1, then with probability $1-\zeta$ the original branching process would be a copy of a critical process, whose total population size has a power-law tail, and with probability $\zeta$ it would be infinite. So the finite part of the distribution of the total population size would have a power-law tail. This contradicts the fact that the original branching process is supercritical, and so has a finite tail which decays exponentially. Hence $\lambda(1-\zeta)<1$.
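For Poisson offspring, the tail behaviour given by Dwass’s Theorem can be checked numerically, since $X_1+\ldots+X_n\sim\text{Po}(n\lambda)$ exactly. The sketch below is only an illustration: the function names are my own, and `cramer_rate` is the large deviations rate $\lambda-1-\log\lambda$ for a $\text{Po}(\lambda)$ sample mean to equal 1, which vanishes only at $\lambda=1$.

```python
import math

def log_total_progeny_pmf(n, lam):
    """log P(T = n) for Po(lam) offspring, via Dwass with X_1 + ... + X_n ~ Po(n * lam)."""
    mean = n * lam
    # log P(Po(mean) = n - 1); note lgamma(n) = log((n - 1)!).
    log_poisson = -mean + (n - 1) * math.log(mean) - math.lgamma(n)
    return log_poisson - math.log(n)

def cramer_rate(lam):
    """Exponential decay rate of P(T = n) in n for Po(lam) offspring."""
    return lam - 1.0 - math.log(lam)

for lam in (0.5, 1.0, 2.0):
    for n in (100, 10000):
        lp = log_total_progeny_pmf(n, lam)
        # Strip off the exponential factor; what is left should behave like n^(-3/2).
        scaled = math.exp(lp + n * cramer_rate(lam) + 1.5 * math.log(n))
        print(f"lam={lam}, n={n}: -log P(T=n)/n = {-lp / n:.4f}, "
              f"n^(3/2) * exp(n * rate) * P(T=n) = {scaled:.4f}")
```

For $\lambda\neq 1$ the first printed quantity should settle near the positive rate (exponential decay), while for $\lambda=1$ the rate is zero and only the $n^{-3/2}$ power law remains.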