# Random walks conditioned to stay positive

In this post, I’m going to discuss some of the literature concerning the question of conditioning a simple random walk to lie above a line with fixed gradient. A special case of this situation is conditioning to stay non-negative. Some notation first. Let $(S_n)_{n\ge 0}$ be a random walk with IID increments, with distribution X. Take $\mu$ to be the expectation of these increments, and we’ll assume that the variance $\sigma^2$ is finite, though at times we may need to enforce slightly stronger regularity conditions.

(Although simple symmetric random walk is a good example for asymptotic heuristics, in general we also assume that if the increments are discrete they don’t have parity-based support, or any other arithmetic property that prevents local limit theorems holding.)

We will investigate the probability that $S_n\ge 0$ for n=0,1,…,N, particularly for large N. For ease of notation we write $T=\inf\{n\ge 0\,:\, S_n<0\}$ for the hitting time of the negative half-plane. Thus we are interested in $S_n$ conditioned on T>N, or T=N, mindful that these might not be the same. We will also discuss briefly to what extent we can condition on $T=\infty$.

In the first paragraph, I said that this is a special case of conditioning SRW to lie above a line with fixed gradient. Fortunately, all the content of the general case is contained in the special case. We can repose the question of $S_n$ conditioned to stay above $n\alpha$ until step N by the question of $S_n-n\alpha$ (which, naturally, has drift $\mu-\alpha$) conditioned to stay non-negative until step N, by a direct coupling.

Applications

Simple random walk is a perfectly interesting object to study in its own right, and this is a perfectly natural question to ask about it. But lots of probabilistic models can be studied via naturally embedded SRWs, and it’s worth pointing out a couple of applications to other probabilistic settings (one of which is the reason I was investigating this literature).

In many circumstances, we can desribe random trees and random graphs by an embedded random walk, such as an exploration process, as described in several posts during my PhD, such as here and here. The exploration process of a Galton-Watson branching tree is a particularly good example, since the exploration process really is simple random walk, unlike in, for example, the Erdos-Renyi random graph G(N,p), where the increments are only approximately IID. In this setting, the increments are given by the offspring distribution minus one, and the hitting time of -1 is the total population size of the branching process. So if the expectation of the offspring distribution is at most 1, then the event that the size of the tree is large is an atypical event, corresponding to delayed extinction. Whereas if the expectation is greater than one, then it is an event with limiting positive probability. Indeed, with positive probability the exploration process never hits -1, corresponding to survival of the branching tree. There are plenty of interesting questions about the structure of a branching process tree conditional on having atypically large size, including the spine decomposition of Kesten [KS], but the methods described in this post can be used to quantify the probability, or at least the scale of the probability of this atypical event.

In my current research, I’m studying a random walk embedded in a construction of the infinite-volume DGFF pinned at zero, as introduced by Biskup and Louidor [BL]. The random walk controls the gross behaviour of the field on annuli with dyadically-growing radii. Anyway, in this setting the random walk has Gaussian increments. (In fact, there is a complication because the increments aren’t exactly IID, but that’s definitely not a problem at this level of exposition.) The overall field is decomposed as a sum of the random walk, plus independent DGFFs with Dirichlet boundary conditions on each of the annuli, plus asymptotically negligible corrections from a ‘binding field’. Conditioning that this pinned field be non-negative up to the Kth annulus corresponds to conditioning the random walk to stay above the magnitude of the minimum of each successive annular DGFF. (These minima are random, but tightly concentrated around their expectations.)

Conditioning on $\{T > N\}$

When we condition on $\{T>N\}$, obviously the resulting distribution (of the process) is a mixture of the distributions we obtain by conditioning on each of $\{T=N+1\}, \{T=N+2\},\ldots$. Shortly, we’ll condition on $\{T=N\}$ itself, but first it’s worth establishing how to relate the two options. That is, conditional on $\{T>N\}$, what is the distribution of T?

Firstly, when $\mu>0$, this event always has positive probability, since $\mathbb{P}(T=\infty)>0$. So as $N\rightarrow\infty$, the distribution of the process conditional on $\{T>N\}$ converges to the distribution of the process conditional on survival. So we’ll ignore this for now.

In the case $\mu\le 0$, everything is encapsulated in the tail of the probabilities $\mathbb{P}(T=N)$, and these tails are qualitatively different in the cases $\mu=0$ and $\mu<0$.

When $\mu=0$, then $\mathbb{P}(T=N)$ decays polynomially in N. In the special case where $S_n$ is simple symmetric random walk (and N has the correct parity), we can check this just by an application of Stirling’s formula to count paths with this property. By contrast, when $\mu<0$, even demanding $S_N=-1$ is a large deviations event in the sense of Cramer’s theorem, and so the probability decays exponentially with N. Mogulskii’s theorem gives a large deviation principle for random walks to lie above a line defined on the scale N. The crucial fact here is that the probabilistic cost of staying positive until N has the same exponent as the probabilistic cost of being positive at N. Heuristically, we think of spreading the non-expected behaviour of the increments uniformly through the process, at only polynomial cost once we’ve specified the multiset of values taken by the increments. So, when $\mu<0$, we have

$\mathbb{P}(T\ge(1+\epsilon)N) \ll \mathbb{P}(T= N).$

Therefore, conditioning on $\{T\ge N\}$ in fact concentrates T on N+o(N). Whereas by contrast, when $\mu=0$, conditioning on $\{T\ge N\}$ gives a nontrivial limit in distribution for T/N, supported on $[1,\infty)$.

A related problem is the value taken by $S_N$, conditional on {T>N}. It’s a related problem because the event {T>N} depends only on the process up to time N, and so given the value of $S_N$, even with the conditioning, after time N, the process is just an unconditioned RW. This is a classic application of the Markov property, beloved in several guises by undergraduate probability exam designers.

Anyway, Iglehart [Ig2] shows an invariance principle for $S_N | T>N$ when $\mu<0$, without scaling. That is $S_N=\Theta(1)$, though the limiting distribution depends on the increment distribution in a sense that is best described through Laplace transforms. If we start a RW with negative drift from height O(1), then it hits zero in time O(1), so in fact this shows that conditonal on $\{T\ge N\}$, we have T= N +O(1) with high probability. When $\mu=0$, we have fluctuations on a scale $\sqrt{N}$, as shown earlier by Iglehart [Ig1]. Again, thinking about the central limit theorem, this fits the asymptotic description of T conditioned on T>N.

Conditioning on $T=N$

In the case $\mu=0$, conditioning on T=N gives

$\left[\frac{1}{\sqrt{N}}S(\lfloor Nt\rfloor ) ,t\in[0,1] \right] \Rightarrow W^+(t),$ (*)

where $W^+$ is a standard Brownian excursion on [0,1]. This is shown roughly simultaneously in [Ka] and [DIM]. This is similar to Donsker’s theorem for the unconditioned random walk, which converges after rescaling to Brownian motion in this sense, or Brownian bridge if you condition on $S_N=0$. Skorohod’s proof for Brownian bridge [Sk] approximates the event $\{S_N=0\}$ by $\{S_N\in[-\epsilon \sqrt{N},+\epsilon \sqrt{N}]\}$, since the probability of this event is bounded away from zero. Similarly, but with more technicalities, a proof of convergence conditional on T=N can approximate by $\{S_m\ge 0, m\in[\delta N,(1-\delta)N], S_N\in [-\epsilon \sqrt{N},+\epsilon\sqrt{N}]\}$. The technicalities here emerge since T, the first return time to zero, is not continuous as a function of continuous functions. (Imagine a sequence of processes $f^N$ for which $f^N(x)\ge 0$ on [0,1] and $f^N(\frac12)=\frac{1}{N}$.)

Once you condition on $T=N$, the mean $\mu$ doesn’t really matter for this scaling limit. That is, so long as variance is finite, for any $\mu\in\mathbb{R}$, the same result (*) holds, although a different proof is in general necessary. See [BD] and references for details. However, this is particularly clear in the case where the increments are Gaussian. In this setting, we don’t actually need to take a scaling limit. The distribution of Gaussian *random walk bridge* doesn’t depend on the mean of the increments. This is related to the fact that a linear transformation of a Gaussian is Gaussian, and can be seen by examining the joint density function directly.

Conditioning on $T=\infty$

When $\mu>0$, the event $\{T=\infty\}$ occurs with positive probability, so it is well-defined to condition on it. When $\mu\le 0$, this is not the case, and so we have to be more careful.

First, an observation. Just for clarity, let’s take $\mu<0$, and condition on $\{T>N\}$, and look at the distribution of $S_{\epsilon N}$, where $\epsilon>0$ is small. This is approximately given by

$\frac{S_{\epsilon N}}{\sqrt{N}}\stackrel{d}{\approx}W^+(\epsilon).$

Now take $\epsilon\rightarrow\infty$ and consider the RHS. If instead of the Brownian excursion $W^+$, we instead had Brownian motion, we could specify the distribution exactly. But in fact, we can construct Brownian excursion as the solution to an SDE:

$\mathrm{d}W^+(t) = \left[\frac{1}{W^+(t)} - \frac{W^+(t)}{1-t}\right] \mathrm{d}t + \mathrm{d}B(t),\quad t\in(0,1)$ (**)

for B a standard Brownian motion. I might return in the next post to why this is valid. For now, note that the first drift term pushes the excursion away from zero, while the second term brings it back to zero as $t\rightarrow 1$.

From this, the second drift term is essentially negligible if we care about scaling $W^+(\epsilon)$ as $\epsilon\rightarrow 0$, and we can say that $W^+(\epsilon)=\Theta(\sqrt{\epsilon})$.

So, returning to the random walk, we have

$\frac{S_{\epsilon N}}{\sqrt{\epsilon N}}\stackrel{d}{\approx} \frac{W^+(\epsilon)}{\sqrt{\epsilon}} = \Theta(1).$

At a heuristic level, it’s tempting to try ‘taking $N\rightarrow\infty$ while fixing $\epsilon N$‘, to conclude that there is a well-defined scaling limit for the RW conditioned to stay positive forever. But we came up with this estimate by taking $N\rightarrow\infty$ and then $\epsilon\rightarrow 0$ in that order. So while the heuristic might be convincing, this is not the outline of a valid argument in any way. However, the SDE representation of $W^+$ in the $\epsilon\rightarrow 0$ regime is useful. If we drop the second drift term in (**), we define the three-dimensional Bessel process, which (again, possibly the subject of a new post) is the correct scaling limit we should be aiming for.

Finally, it’s worth observing that the limit $\{T=\infty\}=\lim_{N\rightarrow\infty} \{T>N\}$ is a monotone limit, and so further tools are available. In particular, if we know that the trajectories of the random walk satisfy the FKG property, then we can define this limit directly. It feels intuitively clear that random walks should satisfy the FKG inequality (in the sense that if a RW is large somewhere, it’s more likely to be large somewhere else). You can do a covariance calculation easily, but a standard way to show the FKG inequality applies is by verifying the FKG lattice condition, and unless I’m missing something, this is clear (though a bit annoying to check) when the increments are Gaussian, but not in general. Even so, defining this monotone limit does not tell you that it is non-degenerate (ie almost-surely finite), for which some separate estimates would be required.

A final remark: in a recent post, I talked about the Skorohod embedding, as a way to construct any centered random walk where the increments have finite variance as a stopped Brownian motion. One approach to conditioning a random walk to lie above some discrete function is to condition the corresponding Brownian motion to lie above some continuous extension of that function. This is a slightly stronger conditioning, and so any approach of this kind must quantify how much stronger. In Section 4 of [BL], the authors do this for the random walk associated with the DGFF conditioned to lie above a polylogarithmic curve.

References

[BD] – Bertoin, Doney – 1994 – On conditioning a random walk to stay nonnegative

[BL] – Biskup, Louidor – 2016 – Full extremal process, cluster law and freezing for two-dimensional discrete Gaussian free field

[DIM] – Durrett, Iglehart, Miller – 1977 – Weak convergence to Brownian meander and Brownian excursion

[Ig1] – Iglehart – 1974 – Functional central limit theorems for random walks conditioned to stay positive

[Ig2] – Iglehart – 1974 – Random walks with negative drift conditioned to stay positive

[Ka] – Kaigh – 1976 – An invariance principle for random walk conditioned by a late return to zero

[KS] – Kesten, Stigum – 1966 – A limit theorem for multidimensional Galton-Watson processes

[Sk] – Skorohod – 1955 – Limit theorems for stochastic processes with independent increments

# Duality for Interacting Particle Systems

Yesterday I introduced the notion of duality for two stochastic processes. My two goals for this post are to elaborate on the idea of why duality is useful, which we touched on in passing in the previous part, and to discuss duality of interacting particle systems. In the latter case, there are often nice ways to consider the forward and backward processes together that make the relation somewhat more natural.

The starting point is to assume a finite state space. This will be reasonable when we start to consider interacting particle systems, eg on $\{0,1\}^{[n]}$. As before, call the spaces R and S, and a duality function H(x,y). Since the state-spaces are finite, it is entirely natural to think of this as a matrix, and hence as an operator. Of course, a function defined on a finite state-space can be thought of as a vector, so it is clear what this operator will actually operate on. (I’ve chosen H rather than h for the duality function so it is more clear that it is acting as an operator here.)

We have some choice about which way round to define it, but for now let’s say that given some function f(.) on S

$Hf(x):=\sum_{y\in S} H(x,y)f(y).$

Note that this is a) exactly the definition of matrix (left-)multiplication; b) We should think of Hf as a function on R – perhaps (Hf)(x) might be more clear? and c) the operator H acts $\mathbb{R}^S\rightarrow \mathbb{R}^R$. If we want the corresponding operator $\mathbb{R}^R\rightarrow\mathbb{R}^S$, we simply multiply by H on the right instead.

But note also that the generator of a finite state-space Markov process is also a matrix, indeed a Q-matrix. So if we take our definition of the duality function as

$\mathcal{G}_X h(x,y)=\mathcal{G}_Y h(x,y),$

which, importantly, holds for all x,y, we can convert this into an algebraic form as

$\mathcal{G}_X H = H \mathcal{G}_Y^\dagger.$

In the same way that n-step transition probabilities for a discrete-time Markov chain are given by the product of the one-step transition matrix, general time transition probabilities for a continuous-time Markov chain are given by exponents of the Q-matrix. In particular, if X and Y have transition kernels P and Q respectively, then $P_t=e^{tG_X}$, and after doing some manipulation, we can show that

$P_t H=H Q_t^\dagger,$

also. This is really useful as in general we would hope that H might be invertible, from which we derive

$P_t=HQ_t^\dagger H^{-1}.$

So this is a powerful statement about the relationship between the evolutions of the two processes. In particular, it shows a correspondence (given by H) between left eigenvectors of P, and right eigenvectors of Q, and vice versa naturally.

The reason why this is useful rather than merely cute, is that when we re-interpret everything in terms of the original stochastic processes, we get a map between stationary distributions of X, and harmonic functions of Y. Stationary distributions are often hard to describe in any terms other than the left-1-eigenvector, or through some convergence property that is typically hard to work with. Harmonic functions, on the other hard, can be much more tractable. An example of a harmonic function is the survival probability started from a given state. This is useful for specifying the stationary distribution, but perhaps even more so for describing properties of the set of stationary distributions. In particular, uniqueness and existence are carried across this equivalence. So, for example, if the dual does not survive almost surely, then this says the only stationary measure is zero, and so the process is transient or similar.

Jan Swart’s course in Luminy last October dealt with duality, with a focus mainly on interacting particle systems. There are a couple of themes I want to talk about, without going into too much detail.

A typical interacting particle system will take place on a locally finite graph. At each vertex, there is either a particle, or there isn’t. Particles move between adjacent vertices, and sometimes interact with particles at adjacent vertices. These interactions might involve branching or coalescence. We will discuss shortly the set of possible forms such interaction might take. The state space is $\{0,1\}^{V(G)}$, with G the underlying graph. Then given a state, there is some set of actions which might happen next, and we consider the possibility that they happen with exponential rates.

At this stage, it seems like the initial configuration is important, as this affects what set of moves can happen immediately, and also thereafter. It is not clear how quickly this dependence fades. One useful idea is not to restrict ourselves to interactions involving the particles currently present in the system, but instead to consider a Poisson process of all possible interactions. Only the moves actually permitted by the current state will happen, but having this extra information allows for coupling between initial configurations.

It’s probably easier to consider a concrete example. The picture below shows the set-up for a branching random walk up an integer lattice. Each particle moves to one of the two state directly above its current state, or it branches and sends particles to both of them.In the diagram, we have glued arrows onto every state at every time, which tells us what to do if there is a particle there at each time. As a coupling, we can now think of the process as a deterministic walk through a random environment. The environment is given by some probability space, which in continuous time might have the appearance of a Poisson process on the set of ‘moves’, and the initial condition of the walk is up to us.

We can generalise this to a broader class of interacting particle systems. If we want all interactions to be between pairs of adjacent states, there are six possible things which could happen:

• Annihilation: two adjacent particles destroy each other. ( 11 -> 00 )
• Branching: one particle becomes two particles. ( 01 or 10 -> 11 )
• Coalescence: two particles merge. ( 11 -> 01 or 10 )
• Death: A particle is removed. ( 01 or 10 -> 00 )
• Exclusion: a particle moves. ( 01 -> 10 )
• Birth: a particle is created. ( 00 -> 01 or 10 )

For now we exclude the possibility of birth. Note that the way we have set this up involving two-site interactions excludes the possibility of a particle trying to move to an already-occupied site.

Let us say that in process X the rates at which each of these events happen are a, b, c, d and e, taking advantage of the helpful choice of naming. There is some flexibility about whether the rates are the same between every pair of vertices of note. For this post we assume that they are. Then it is a result of Lloyd and Sudbury that given some real $q\neq 1$, the process X’ with corresponding rates given by:

$a'=a+2q\gamma, b'=b+\gamma, c'=c-(1+q)\gamma, d'=d+\gamma, e'=e-\gamma,$

for $\gamma:= \frac{a+c-d+qb}{1-q},$

is dual to X, with duality function given by $h(Y,Z)=q^{|Y\cap Z|}$, for Y and Z possible states.

I want to make two comments:

1) This illustrates one of the differences between the dual and the time-reversal. It is clear that the time-reversal of branching is coalescence and vice versa, and exclusion is invariant under time-reversal. But the time-reversal of death is definitely birth, but there is no birth component in the dual of a process which features death. I don’t have a strong intuition for why this is the case, but see the final paragraph of this post. However, at least it seems plausible that both processes might simultaneously be recurrent, since in the dual, both the branching rate and the death rate have increased by the same amount.

2) This settles one problem of uniqueness of the dual that I mentioned last time, since we can vary q and get a different dual to the same original process. For example, in the voter model, we have b=d=1, and a=c=e=0, as in any update, the opinions of neighbours which were previously different become the same. Anyway, for any $q\in[-1,0]$ there is a choice of dual, where at the extremes q=0 corresponds to coalescing random walk, and q=-1 to annihilating random walk. (Note that the duality function for q=0 is the indicator function that the systems are different.)

As a final observation without much justification, suppose we add in arrows in the gaps of the branching random walk picture we had earlier, and direct them in the opposite direction. It turns out that this corresponds precisely to the dual of the process. This provides an appealing visual idea of why the dual of branching might be death. It also supports the general idea based on the coupling described earlier that the dual process is in some sense a deterministic walk in the opposite direction through the random environment specified by the original process.

REFERENCES

J.M. Swart – Duality and Intertwining of Markov Chains (mainly using chapters 2.1 and 2.7)

Thanks for Daniel Straulino for direction towards the branching random walk duality example.

# Branching Random Walk and Amenability

This post is about some of the things I learned in an interesting given by Elisabetta Candellero in Oxford last week, based on joint work with Matt Roberts. The paper on which this is based can be found here. The main thing I want to talk about are some properties of graphs which were mentioned near the beginning which I hadn’t heard about before.

Branching Random Walk (hereafter BRW) is a model to which much attention has been paid, because of its natural applications in a range of physical and genetic settings. As with many of the best models, the definition is pretty much in the title. We take the ingredients for a random walk on a graph, which is a graph, and a transition matrix P on that graph. For most of the time we will consider simple random walk, so the graph G exactly specifies P. This requires the additional condition that the graph G is locally finite. We will introduce a branching mechanism, so at discrete times {0,1,2,…} we will track both the number of particles, and their current locations. We start at time 0 with a single particle at some vertex. Then at each time-step, all the vertices present die, and each gives birth independently to some number of offspring according to a fixed probability distribution $\mu$. These offspring then perform one move according to transition matrix P. Note that if you want the system to carry the appearance of having no death, then taking the support of the offspring distribution to be {1,2,3,…} achieves precisely this. The properties we consider will not be very interesting unless G is infinite, so assume that from now on.

There are almost limitless ways we could think of to generalise these dynamics. The offspring distribution could be allowed to depend on the vertex the particle is occupying. The joint transition probabilities of the offspring at a vertex could be biased in favour or against the offspring moving to the same site next. The environment could be chosen in advance before the process starts, but random.

The classical question about BRW is that of recurrence and transience. The definition extends naturally from that of a Markov chain (which any non-branching random walk on a graph is). As in that setting, we say a BRW is recurrent if every vertex is almost surely visited infinitely often by particles of the graph.

Heuristically, we should observe that in some sense, it is quite difficult for simple random walk on an infinite graph to be recurrent. We have examples in $\mathbb{Z},\mathbb{Z}^2$, but these are about as ‘small’ as an infinite graph can be. An idea might be that if the number of sites some distance away from where we start grows rapidly as the distance grows, then there isn’t enough ‘pull’ back to visit the sites near where we start infinitely often. Extending this argument, it is easier for a BRW to be recurrent, as we have the option to make the branching rate large, which means that there are lots of particles at large times, hence more possibility for visiting everywhere. Note that if the offspring distribution is subcritical, we don’t stand a chance of having interesting properties. If we ignore the random walk part, we just have a subcritical Galton-Watson process, which dies out almost surely.

We need a measure of the concept discussed in the heuristic for how fast the number of vertices in the graph grows as we consider bands of vertices further and further away from the starting vertex. The standard measure for this is the spectral radius, which is defined not in terms of number of vertices, but through the limiting probability of returning to a fixed vertex at large time n. Precisely

$\rho:= \limsup \mathbb{P}_i(X_n=i)^{1/n},$

so in some approximation sense

$\mathbb{P}_i(X_n=i)\sim \rho^{n},$

which explains why $\rho\le 1$. Note that by considering the sum of such terms, if simple random walk on G is recurrent, then $\rho=1$, but the converse does not hold. (Consider SRW on $\mathbb{Z}^3$ for example.)

It’s also worth remarking that $\rho$ is a class property. In particular, for a connected graph, the value of $\rho$ is independent of i. This is not surprising, as if d is the graph distance between vertices i and j, then

$p_{ii}^{(n)}\ge p_{ij}^{(d)}p_{jj}^{(n-2d)}p_{ji}^{(d)},$

and vice versa, which enables us to sandwich usefully for the limits.

Really, $\rho$ is a function of the transition matrix P. In fact, we can be more specific, by considering diagonalising P. The only case we care about is when P is infinite, so this is not especially nice, but it makes it clear why $p_{ii}^{(n)}$ decays like $|\rho|^n$ where $\rho$ is the largest eigenvalue of P. Indeed this is an alternative definition of the spectral radius. Note that Perron-Frobenius theory (which seems to keep coming up on the blog this week…) says that since $|\rho|\le 1$, then if $|\rho|=1$, we must have $\rho=1$. So the spectral radius being 1 is precisely equivalent to having an invariant measure. We don’t know whether we can normalise it, but P-F guarantees the relevant left-eigenvector is non-negative, and hence a measure.

Next we give this situation a name. Say that a random walk is amenable if $\rho(P)=1$. We can extend this property to say that a graph is amenable if SRW on it is amenable.

This is not the standard definition of amenability. This property is originally defined (by von Neumann) in the context of groups. A group G is said to be amenable if there exists a left-invariant probability measure on G, ie $\mu$ such that

$\forall A\subset G, \forall g\in G, \mu(gA)=A.$

The uniform distribution shows that any finite group is amenable.

It turns out that in general there are several conditions for a group which are equivalent to amenability. One is that, given G finitely generated by B, the Cayley graph for G with edges given by elements of B does not satisfy a strong isoperimetric inequality. Such an inequality is an alternative way of saying that the graph grows rapidly. It says that the size of the boundary of a subset of the vertices is uniformly large relative to the size of the set. Precisely, there exists a constant c>0 such that whenever U is a finite subset of the vertices, we have $|\partial U|\ge c|U|$. (Note that finiteness of U is important – we would not expect results like this to hold for very large subsets.)

Kesten proved that it is further equivalent to the statement that simple random walk on $Cay(G,B)$ is amenable in our original sense. This technical and important result links the two definitions.

We finish by declaring the main classical result in BRW, which is a precise condition for transience. As motivated earlier, the rate of branching and the spectral radius have opposing effects on whether the system is recurrent or transient. Note that at some large time, the expected number of particles which have returned to the starting vertex is given by the expected number of particles in the system multiplied by the probability that any one of them is back at its origin, ie $\sim \mu^n\rho^n$. So the probability that there is a particle back at the origin at this time is (crudely transferring from expectation to probability) $1\wedge (\mu \rho)^n$. We can conclude that the chain is recurrent if $\mu > \rho^{-1}$ and transient if $\mu<\rho^{-1}$. This result is due to Benjamini and Peres.

The remaining case, when $\mu=\rho^{-1}$ is called, unsurprisingly, critical BRW. It was proved in ’06 by Gantert and Muller that, in fact, all critical BRWs are transient too. This must exclude the amenable case, as we could think of SRW on $\mathbb{Z}$ as a critical BRW by taking the branching distribution to be identically one, as the spectral radius is also 1.

In the end, the material in this post is rather preliminary to the work presented in EC’s talk, which concerned the trace of BRW, and whether there are infinitely many essentially different paths to infinity taken by the particles of the BRW. They show that this holds in a broad class of graphs with symmetric properties.