DGFF 4 – Properties of the Green’s function

I’m at UBC this month for the PIMS probability summer school. One of the long courses is being given by Marek Biskup about the Discrete Gaussian Free Field (notes and outline here) so this seems like a good moment to revive the sequence of posts about the DGFF. Here’s DGFF1, DGFF2, DGFF3 from November.

The first draft of this post was about the maximum of the DGFF in a large box V_N, and also about the Green’s function G^{V_N}(x,y), which specifies the covariance structure of the DGFF. This first draft also became too long, so I’m splitting it into two somewhat shorter ones. As we’ll see, some understanding and standard estimates of the Green’s function are enough to say quite a bit about the maximum. In this first post, we’ll explore some ‘low-hanging fruit’ concerning the Green’s function, as defined through a simple random walk, which are useful, but rarely explained in the DGFF literature.

Symmetry of Green’s function

We start with one of these low-hanging fruit. If G^{V_N} is to be a covariance matrix, it has to be symmetric. In the first post, showing that the definition of the DGFF as a random field with given Hamiltonian is equivalent to \mathcal{N}(0,G^{V_N}) certainly can be viewed as a proof of symmetry. However, it would be satisfying if there were a direct argument in the language of the definition of the Green’s function.

To make this self-contained, recall the random walk definition of G^{V_N}(x,y). Let (S_m)_{m\ge 0} be simple random walk on V_N, and \mathbb{P}_x,\,\mathbb{E}_x denote starting the random walk at x\in V_N. As usual, let \tau_y,\,\tau_A denote the hitting time of a vertex y or a set A respectively. Then

G^{V_N}(x,y):= \mathbb{E}_x \left[ \sum_{m=0}^{\tau_{\partial V_N}}1_{(S_m=y) }\right].

That is, G^{V_N}(x,y) is the expected number of visits to y by a random walk from x, before it exits V_N.

Let’s drop the superscript for now, as everything should hold for a more general subset D of the lattice. I don’t think it’s immediately obvious at the level of Markov chains why G(x,y)=G(y,x). In particular, it’s not the case that

\mathbb{P}_x(\tau_y < \tau_{D^c}) = \mathbb{P}_y(\tau_x <\tau_{D^c}),

and it feels that we can’t map between paths x \to \partial D and y\to \partial D in a way that preserves the number of visits to y and x, respectively. However, we can argue that for any m

\mathbb{P}_x(S_m=y, \tau_{D^c}>m) = \mathbb{P}_y(S_m=x, \tau_{D^c}>m),

by looking at the suitable paths of (S_m). That is, if we have a path x=S_0,S_1,\ldots,S_m=y that stays within D, then the probability of seeing this path starting from x and its reverse direction starting from y are equal. Why? Because

\mathbb{P}_x(S_0=x,S_1=v_1,\ldots,S_{m-1}=v_{m-1},S_m=y) = \prod_{\ell=0}^{m-1} \frac{1}{\mathrm{deg}(v_\ell)},

and

\mathbb{P}_y(S_0=y,S_1=v_{m-1},\ldots,S_{m-1}=v_1, S_m=x) = \prod_{\ell=0}^{m-1} \frac{1}{\mathrm{deg}(v_{m-\ell})} = \prod_{\ell=1}^m \frac{1}{\mathrm{deg}(v_\ell)}.

Since D\subset \mathbb{Z}^d and x,y are in the interior of D, we must have \mathrm{deg}(x)=\mathrm{deg}(y), and so these two expressions are equal. Summing over all such two-way paths, and then all m gives the result.
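As a sanity check, here is a minimal sketch that computes the Green's function by summing the killed transition probabilities over m, exactly as in the argument above. The 5 x 3 rectangle, the names green_function, hit_x_to_y and hit_y_to_x, and the truncation threshold are my own choices for illustration, and "killed" here means the walk stops as soon as it steps outside the chosen set of sites.

```python
def green_function(domain):
    """G[x][y] = expected visits to y by SRW on Z^2 from x, killed on leaving `domain`."""
    G = {}
    for x in domain:
        visits = dict.fromkeys(domain, 0.0)
        mass = {x: 1.0}                      # law of S_m restricted to {walk still alive}
        while sum(mass.values()) > 1e-13:    # truncate once almost all mass is killed
            step = {}
            for (i, j), p in mass.items():
                visits[(i, j)] += p
                for z in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if z in domain:          # mass stepping outside the domain is lost
                        step[z] = step.get(z, 0.0) + p / 4
            mass = step
        G[x] = visits
    return G

D = {(i, j) for i in range(5) for j in range(3)}   # hypothetical small rectangle
G = green_function(D)

x, y = (0, 0), (2, 1)                # a corner and the centre
hit_x_to_y = G[x][y] / G[y][y]       # P_x(tau_y < exit time)
hit_y_to_x = G[y][x] / G[x][x]       # P_y(tau_x < exit time)
print(G[x][y], G[y][x], hit_x_to_y, hit_y_to_x)
```

The two Green's function values agree, while the two hitting probabilities do not, because the corner and the centre have different values of G on the diagonal.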

Fixing one argument

We now focus on G^D(\cdot,y), where the second argument is fixed. This is the solution to the Poisson equation

\Delta G^D(\cdot,y) = -\delta_y(\cdot),\quad G^D(x,y)=0,\; \forall x\in \partial D.

To see this, we can use a standard hitting probability argument (as here) with the Markov property. This is harmonic in D\backslash \{y\}, and since we know

G^D(y,y)= \frac{1}{\mathbb{P}_y(\text{RW hits }\partial D\text{ before returning to }y)},

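this uniquely specifies G^D(\cdot,y). Anyway, since harmonic functions achieve their maxima at the boundary (and here the boundary of D\backslash \{y\} consists of \partial D, where G vanishes, together with y itself), we have G(y,y)\ge G(x,y) for all x\in D. We can also see this from the SRW definition, up to exchanging the roles of x and y, as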

G(x,y)=G(y,x) = \mathbb{P}_y (\tau_x < \tau_{\partial D} ) G(x,x) \le G(x,x).
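For completeness, the hitting probability argument behind the Poisson equation can be written out as follows. I'm assuming here that \Delta is the normalised discrete Laplacian, \Delta f(x)=\frac{1}{2d}\sum_{z\sim x}(f(z)-f(x)), which is the convention matching the constant on the right-hand side.

```latex
\begin{align*}
G^D(x,y) &= \mathbf{1}_{\{x=y\}} + \frac{1}{2d}\sum_{z\sim x} G^D(z,y),
  \qquad x\in D \quad\text{(condition on the first step $S_1=z$)},\\
\Delta G^D(\cdot,y)(x) &= \frac{1}{2d}\sum_{z\sim x}\left(G^D(z,y)-G^D(x,y)\right)
  = -\mathbf{1}_{\{x=y\}} = -\delta_y(x).
\end{align*}
```

The term \mathbf{1}_{\{x=y\}} counts the visit at time zero, and G^D(z,y)=0 for z\in\partial D because the walk is stopped there.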

Changing the domain

Now we want to consider nested domains D\subset E, and compare G^D(\cdot,\cdot) and G^E(\cdot,\cdot) on D\times D. The idea is that for SRW started from x\in D, we have \tau_{\partial D}\le \tau_{\partial E}, since one boundary is contained within the other. From this, we get

G^D(x,y)\le G^E(x,y),\quad \forall x,y\in D,

and we will use the particular case y=x.

For example, if x\in V_N, the box with width N, then the box with width 2N centred on x contains the whole of V_N. So, if we set \bar {V}_{2N}:= [-N,N]^d, then with reference to the diagram, we have

G^{V_N}(x,x)\le G^{\bar{V}_{2N}}(0,0),\quad x\in V_N.

As we’ll see when we study the maximum of the DGFF on V_N, uniform control over the pointwise variance will be a useful tool.
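Here is a small numerical sketch of this comparison in d=2. It solves the Poisson equation for G(\cdot,y) by Gauss-Seidel sweeps, which is my choice of method rather than anything from the literature, and the names green_column, V, big and the value N=4 are purely illustrative. A domain is represented by the set of sites where the walk is still alive.

```python
def green_column(domain, y, tol=1e-12):
    """Solve g(x) = 1{x=y} + (1/4) * sum_{z ~ x} g(z) on `domain`, with g = 0 outside."""
    g = {x: 0.0 for x in domain}
    while True:
        change = 0.0
        for (i, j) in domain:
            nbrs = ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
            new = (1.0 if (i, j) == y else 0.0) + 0.25 * sum(g.get(z, 0.0) for z in nbrs)
            change = max(change, abs(new - g[(i, j)]))
            g[(i, j)] = new
        if change < tol:                 # stop when a full sweep changes nothing visible
            return g

N = 4
V = {(i, j) for i in range(N + 1) for j in range(N + 1)}             # box of width N
big = {(i, j) for i in range(-N, N + 1) for j in range(-N, N + 1)}   # [-N, N]^2

bound = green_column(big, (0, 0))[(0, 0)]            # G at the centre of the big box
diag = {x: green_column(V, x)[x] for x in V}         # G^{V}(x, x) for every x in V
print(max(diag.values()), bound)
```

Every diagonal entry for the small box sits below the single number computed for the larger box, which is the uniform control on the pointwise variance.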

Maximising the Green’s function

The idea of bounding G^{V_N}(x,x) by G^{\bar V_{2N}}(0,0) for any x\in V_N is clever and useful. But a more direct approach would be to find the value of x that maximises G^{V_N}(x,x). We would conjecture that when V_N has a central vertex, then this is the maximiser.

We can prove this directly from the definition of the Green’s function in terms of random walk occupation times. Let’s assume we are working with \bar{V}_N for even N, so that 0 is the central vertex. Again, since

G^D(x,x)=\frac{1}{\mathbb{P}_x(\text{RW hits }\partial D\text{ before returning to }x)}, (*)

it would suffice to show that this probability is minimised when x=0. This feels right, since 0 is furthest from the boundary. However, other points are closer to the boundary in some directions but further in others, so we can’t argue simply by conditioning on the maximum distance from its start point achieved by an excursion of SRW (the lattice is vertex-transitive, so these excursions look the same from all starting points). Indeed, even allowing for the four possible rotations, for an excursion of diameter slightly larger than N, starting at the centre is maximally bad.

However, intuitively it does feel as if being closer to the boundary makes you more likely to escape earlier. In fact, with a bit more care, we can couple the SRW started from 0 and the SRW started from r=(r^x,r^y)\ne 0 such that the latter always exits first. For convenience we’ll assume also that r^x,r^y are both even.

I couldn’t find any reference to this, so I don’t know whether it’s well-known or not. The following argument involves projecting into each axis, and doing separate couplings for transitions in the x-direction and transitions in the y-direction. We assume WLOG that r is in the upper-right quadrant as shown. Then, let 0=S_0,S_1,S_2,\ldots be SRW started from 0, and we will construct r=R_0,R_1,R_2,\ldots on the same probability space as (S_m)_{m\ge 0} as follows. For every m, we set the increment R_{m+1}-R_m to be \pm(S_{m+1}-S_m). It remains to specify the sign, which will be determined by the direction of the S-increment, and a pair of stopping times. The marginal is therefore again an SRW, started from r. Temporarily, we use the unusual notation S_m= (S^x_m,S^y_m) for the coordinates of S_m.

So, if S_{m+1}-S_m=(1,0), (-1,0), ie S moves left or right, then we set

R_{m+1}-R_m = \begin{cases} -(S_{m+1}-S_m) &\quad \text{if }m<T^x\\ S_{m+1}-S_m&\quad \text{if }m\ge T^x.\end{cases} (*)

where T^x:= \min\{m\,:\, R^x_m=S^x_m\}. That is, R^x moves in the opposing direction to S^x until the first time when they are equal (hence the parity requirement), and then they move together. WLOG assume that r^x>0. Then suppose S^x_m=\pm N and such m is minimal. Then by construction, if m\ge T^x, then R^x_m=\pm N also. If m<T^x, then we must have S^x_m=-N, and so since R^x’s trajectory is a mirror image of S^x’s, in fact R^x_m = N+r^x>N, so R^x hit +N first. In both cases, we see that R^x hits \pm N at the same time or before S^x.

In other words, for m<T^x the (lazy) random walk R^x is the mirror image of S^x, reflected in the point r^x/2, so that R^x_m=r^x-S^x_m. The two coordinates can therefore only meet at S^x_m=R^x_m=r^x/2 (recall the parity condition on r), and T^x is the first time this happens. So (*) is the mechanism for m<T^x, and then for m\ge T^x, we have S^x_m=R^x_m.

Similarly, if S_{m+1}-S_m =(0,1), (0,-1), ie S moves up or down, then we set

R_{m+1}-R_m = \begin{cases} -(S_{m+1}-S_m)&\quad \text{ if }m<T^y\\  S_{m+1}-S_m&\quad \text{if }m\ge T^y,\end{cases}

with corresponding definition of the stopping time T^y.

This completes the coupling, and by considering T^x\wedge T^y, we have shown that the exit time for the walk started from zero dominates the exit time for the walk started from r. Recall that so far we are in the case where the box has even width and r=(r^x,r^y) has even coordinates.
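The coupling is easy to simulate, and the following sketch does so in the box [-N,N]^2, recording for each run the first time that each walk has a coordinate equal to \pm N. The function name coupled_exit_times, the values N=10 and r=(4,2), and the seed are hypothetical choices for illustration. The walk R is allowed to continue after its exit so that the coupling with S stays defined.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def coupled_exit_times(N, r, rng):
    """Run S from 0 and R from r under the mirror coupling; return (exit of S, exit of R)."""
    sx, sy = 0, 0
    rx, ry = r
    exit_R = 0 if max(abs(rx), abs(ry)) >= N else None
    m = 0
    while True:
        dx, dy = rng.choice(STEPS)
        if dx != 0:
            # Mirror the x-step until the x-coordinates agree, then copy it.
            rx += dx if rx == sx else -dx
            sx += dx
        else:
            # Same rule for the y-coordinate.
            ry += dy if ry == sy else -dy
            sy += dy
        m += 1
        if exit_R is None and max(abs(rx), abs(ry)) >= N:
            exit_R = m
        if max(abs(sx), abs(sy)) >= N:
            return m, exit_R

rng = random.Random(0)
results = [coupled_exit_times(10, (4, 2), rng) for _ in range(2000)]
never_later = all(exit_R is not None and exit_R <= exit_S for exit_S, exit_R in results)
print(never_later)
```

Because the x- and y-coordinates agree forever once they first meet, checking equality of the current coordinates is the same as checking whether m has passed T^x or T^y.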

This exit time comparison isn’t exactly what we need to compare G^N(0,0) and G^N(r,r). It’s worth remarking at this stage that if all we cared about was the Green’s function on the integer line [-N,N], we would have an easier argument, as by the harmonic property of G(\cdot,y)

G^{[-N,N]}(r,0)=\frac{N-r}{N}G^{[-N,N]}(0,0),

G^{[-N,N]}(0,r) = \frac{N}{N+r}G^{[-N,N]}(r,r),

and so G(0,0)>G(r,r) follows by symmetry. To lift from 1D to 2D directly, we need a bit more than this. It’s possible that S returns in both x- and y- coordinates more often than R, but never at the same time. Fortunately, the coupling we defined slightly earlier does give us a bit more control.
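The one-dimensional identities can be checked exactly. The sketch below inverts the matrix \mathbf{1}-\mathbf{P} for the walk on \{-N+1,\ldots,N-1\} killed at \pm N, using rational arithmetic and plain Gauss-Jordan elimination; the name green_1d and the values N=6, r=2 are illustrative assumptions.

```python
from fractions import Fraction

def green_1d(N):
    """Exact Green's function of SRW on {-N+1, ..., N-1}, killed on hitting -N or +N."""
    sites = list(range(-N + 1, N))
    n = len(sites)
    # Augmented matrix [1 - P | 1], where P is the killed transition matrix.
    A = [[Fraction(0)] * (2 * n) for _ in range(n)]
    for i in range(n):
        A[i][i] = Fraction(1)
        A[i][n + i] = Fraction(1)
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                A[i][j] = Fraction(-1, 2)
    # Gauss-Jordan elimination turns the left block into the identity.
    for c in range(n):
        p = next(i for i in range(c, n) if A[i][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [a / pivot for a in A[c]]
        for i in range(n):
            if i != c and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return {(x, y): A[i][n + j] for i, x in enumerate(sites) for j, y in enumerate(sites)}

N, r = 6, 2
G = green_1d(N)
print(G[(0, 0)], G[(r, r)], G[(0, r)], G[(r, 0)])
```

Combining the two identities gives G(r,r)=\frac{N^2-r^2}{N^2}G(0,0), which is strictly smaller than G(0,0) for r\ne 0.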

Let \tau^x(S), \tau^x(R) be the first times that S^x, R^x hit \pm N. Under this coupling, for any m\ge 0

\mathbb{P}(S^x_m=0, m<T^x) = \mathbb{P}(R^x_m=r^x, m<T^x)

since these events are literally equal. Since we showed that \tau^x(R)\le \tau^x(S) almost surely, we can further deduce

\mathbb{P}(S^x_m=0,m<T^x\wedge \tau^x(S)) \ge \mathbb{P}(S^x_m=0,m<T^x\wedge \tau^x(R))

=\mathbb{P}(R^x_m=r^x, m <T^x \wedge \tau^x(R)).

To address the corresponding events for which m\ge T^x, we apply the strong Markov property at T^x, to obtain SRW Z_m started from r/2, and let \tau_{-N},\tau_{+N} be the hitting times of -N,+N respectively and \tau_{\pm N}=\tau_{-N}\wedge \tau_{+N}. It will now suffice to prove that

\mathbb{P}(Z_m=0, m< \tau_{\pm N}) \ge \mathbb{P}(Z_m=r,m<\tau_{\pm N}), (**)

as then we can apply the law of total probability and sum over values of T^x and m\ge 0.

To prove this result, we consider the following bijection between trajectories of length m from r/2 to \{0,r\}. We decompose the trajectories into excursions away from r/2, and then a final meander from r/2 to \{0,r\} that stays on the same side of r/2. We construct the new trajectory by preserving all the initial excursions, but reversing all the steps of the final meander. So if the original trajectory ended up at 0, the image ends up at r. Trivially, the initial excursions in the image only hit \pm N if the excursions in the original trajectory did this too. But it’s also easy to see, by a similar argument to the coupling at the start of this section, that if the original trajectory ends at r and does not hit \pm N, then the image does not hit \pm N either. However, the converse is not true. So we conclude (**), and thus

\mathbb{P}(S_m^x=0,\, m<\tau^x(S)) \ge \mathbb{P}(R_m^x=r^x,\, m<\tau^x(R))

for all m by combining everything we have seen so far. And so we can now lift to a statement about S_m itself, that is considering both coordinates separately.

 

The remaining cases for r require a little more care over the definition of T^x, though the same projection argument works, for fundamentally the same reason. (Note that in the above argument, if S^x_m=-N and m<T^x, then in fact R^x_m\ge N+2, and so it’s not hard to convince yourself that a sensible adjustment to the stopping time will allow a corresponding result with R^x_m\ge N+1 in the odd r^x case.) The case for N odd is harder, since in one dimension there are two median sites, and it’s clear by symmetry that we can’t couple them such that RW from one always exits at least as early as RW from the other. However, the distributions of exit times started from these two sites are the same (by symmetry), and so although we can’t find a coupling, we can use similar stopping times to obtain a result in probability.

In the next post, we’ll see how to apply this uniform bound on G^{V_N}(x,x) to control the maximum of the DGFF on V_N. In particular, we address how the positive correlations of DGFF influence the behaviour of the maximum by comparison with independent Gaussians at each site.


DGFF 3 – Gibbs-Markov property for entropic repulsion

In the previous post, we saw that it isn’t much extra effort to define the DGFF with non-zero boundary conditions, by adding onto the zero-BC DGFF the unique (deterministic) harmonic function which extends the boundary values into the domain. We also saw how a Gibbs-Markov property applies, whereby the values taken by the field on some sub-region A\subset D depend on the values taken on D\backslash A only through values taken on \partial A.

In this post, we look at how this property and some other methods are applied by Deuschel [1] to study the probability that the DGFF on a large box in \mathbb{Z}^d is positive ‘everywhere’. This event can be interpreted in a couple of ways, all of which are referred to there as entropic repulsion. Everything which follows is either taken directly or paraphrased directly from [1]. I have tried to phrase this in a way which avoids repeating most of the calculations, instead focusing on the methods and the motivation for using them.

Fix dimension d\ge 2 throughout. We let P^0_N be the law of the DGFF on V_N:=[-N,N]^d\subset \mathbb{Z}^d with zero boundary conditions. Then for any subset A\subset \mathbb{Z}^d, in an intuitively-clear abuse of notation, we let

\Omega^+(A):= \{ h_x\ge 0, x\in A\},

be the event that some random field h takes only non-negative values on A. The goal is to determine P^0_N ( \Omega^+(V_N)). But for the purposes of this post, we will focus on showing bounds on the probability that the field is non-negative on a thin annulus near the boundary of V_N, since this is a self-contained step in the argument which contains a blog-friendly number of ideas.

We set (L_N) to be a sequence of integers greater than one (to avoid dividing by zero in the statement), for which \frac{L_N}{N}\rightarrow 0. We now define for each N, the annulus

W_N = \{v\in V_N: L_N\le d_{\mathbb{Z}^d}(v, V_N^c)\le 2L_N \}

with width L_N set a distance L_N inside the box V_N. We aim to control P^0_N (\Omega^+(W_N)). This forms the middle steps of Deuschel’s Propositions 2.5 and 2.9, which discuss P^0_N(\Omega^+(V_{N-L_N})). Clearly there is the upper bound

P^0_N(\Omega^+(V_{N-L_N})) \le P^0_N(\Omega^+(W_N)) (1)

and a lower bound on P^0_N(\Omega^+(V_{N-L_N})) is obtained in the second proposition by considering the box as a union of annuli then combining the bounds on each annulus using the FKG inequality.

Upper bound via odds and evens

After removing step (1), this is Proposition 2.5:

\limsup_{N\rightarrow \infty} \frac{L_N}{N^{d-1} \log L_N} \log P^0_N(\Omega^+(W_N)) < 0. (2)

This is giving a limiting upper bound on the probability of the form L_N^{-CN^{d-1}/L_N}, though as with all LDP estimates, the form given at (2) is more instructive.

Morally, the reason why it is unlikely that the field should be non-negative everywhere within the annulus is that the distribution at each location is centred, and even though any pair of values are positively correlated, this correlation is not strong enough to avoid this event being unlikely. But this is hard to corral into an upper bound argument directly. In many circumstances, we want to prove upper bounds for complicated multivariate systems by projecting to get an unlikely event for a one-dimensional random variable, or a family of independent variables, even if we have to throw away some probability. We have plenty of tools for tail probabilities in both of these settings. Since the DGFF is normal, a one-dimensional RV that is a linear combination (eg the sum) of all the field heights is a natural candidate. But in this case we would have thrown away too much probability, since the only way we could dominate is to demand that the sum \sum_{x\in W_N}h^N_x\ge 0, which obviously has probability 1/2 by symmetry. (3)

So Deuschel splits W_N into W_N^o,W_N^e, where the former includes all vertices with odd total parity in W_N and the latter includes all the vertices with even total parity in the interior of W_N. (Recall that \mathbb{Z}^d is bipartite in exactly this fashion). The idea is to condition on h^N\big|_{W^o_N}. But obviously each even vertex is exactly surrounded by odd vertices. So by the Gibbs-Markov property, conditional on the odd vertices, the values of the field at the even vertices are independent. Indeed, if for each v\in W_N^e we define \bar h_v to be the average of its neighbours (which is measurable w.r.t. the sigma-algebra generated by the odd vertices), then

\{h_v: v\in W_N^e \,\big|\, \sigma(h_w: w\in W_N^o)\},

is a collection of independent normals with variance one, and where the mean of h_v is \bar h_v.
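To see where the unit variance comes from, one can complete the square in the density. The computation below assumes the Hamiltonian normalisation \frac{1}{4d} from the first post in this series, and uses that v has exactly 2d neighbours, all of which are odd.

```latex
\begin{align*}
\sum_{w\sim v}(h_v-h_w)^2
  &= 2d\,h_v^2 - 2h_v\sum_{w\sim v}h_w + \sum_{w\sim v}h_w^2
   = 2d\,(h_v-\bar h_v)^2 + \Big(\sum_{w\sim v}h_w^2 - 2d\,\bar h_v^2\Big),\\
\exp\Big(-\frac{1}{4d}\sum_{w\sim v}(h_v-h_w)^2\Big)
  &\propto \exp\Big(-\frac12 (h_v-\bar h_v)^2\Big)
  \qquad\text{as a function of } h_v .
\end{align*}
```

So, given its neighbours, h_v has the \mathcal{N}(\bar h_v,1) distribution, and the densities for different even vertices factorise because no two even vertices are adjacent.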

To start finding bounds, we fix some threshold m=m_N\gg 1 to be determined later, and consider the odd-measurable event A_N that at most half of the even vertices v have \bar h_v\ge m. So A_N^c\cap \Omega^+(W_N) says that all the odd vertices are non-negative and many are quite large. This certainly feels like a low-probability event, and unlike at (3), we might be able to obtain good tail bounds by projection into one dimension.

In the other case, conditional on A_N, there are a large number of even vertices with conditional mean at most m, and so we can bound the probability that none of these is negative by a product

(1-\varphi(m))^{\frac12 |W_N^e|}. (4)

Note that for this upper bound, we can completely ignore the other even vertices (those with conditional mean greater than m).
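In a little more detail, and under my reading that \varphi(m) stands for the lower tail probability \mathbb{P}(Z<-m) of a standard normal Z (an assumption about the notation, which is not spelled out above), the bound comes from the conditional independence of the even sites. Write E for the set of even vertices with \bar h_v<m, so that |E|\ge \frac12|W_N^e| on A_N.

```latex
\begin{align*}
P^0_N\big(h_v\ge 0,\ v\in W_N^e \,\big|\, \sigma(h_w: w\in W_N^o)\big)
  &= \prod_{v\in W_N^e}\mathbb{P}\big(Z\ge -\bar h_v\big)
   \le \prod_{v\in E}\big(1-\mathbb{P}(Z<-\bar h_v)\big)\\
  &\le \big(1-\mathbb{P}(Z<-m)\big)^{|E|}
   \le \big(1-\varphi(m)\big)^{\frac12 |W_N^e|}.
\end{align*}
```

The first inequality simply drops the factors for vertices outside E, each of which is at most one.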

So we’ll go back to A_N^c \cap \Omega^+(W_N). For computations, the easiest one-dimensional variable to work with is probably the mean of the \bar h_vs across v\in W_N^e, since on A_N^c\cap \Omega^+(W_N) this is at least \frac{m}{2}. Rather than focus on the calculations themselves involving

\bar S^e_N:= \frac{1}{|W_N^e|} \sum\limits_{v\in W_N^e} \bar h_v,

let us remark that it is certainly normal and centered, and so there are many methods to bound its tail, for example

P^0_N \left( \bar S^e_N \ge \frac{m}{2} \right) \le \exp\left( \frac{-m^2}{8\mathrm{Var}(\bar S^e_N)} \right), (5)

which is the bound used by Deuschel, and which follows from an easy comparison argument within the integral of the pdf. We can tackle the variance using the Green’s function for the random walk (recall the first post in this set). But before that, it’s worth making an observation which is general and useful, namely that \bar S^e_N is the expectation of

S^e_N:= \frac{1}{|W_N^e|}\sum\limits_{v\in W_N^e} h_v

conditional on the odds. Directly from the law of total variance, the variance of any random variable X is always at least as large as the variance of \mathbb{E}[X|Y].

So in this case, we can replace \mathrm{Var}(\bar S^e_N) in (5) with \mathrm{Var}(S^e_N), which can be controlled via the Green’s function calculation.
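Spelled out, with X=S^e_N and Y standing for the values of the field at the odd vertices, the law of total variance reads:

```latex
\begin{align*}
\mathrm{Var}(X) &= \mathbb{E}\big[\mathrm{Var}(X\,|\,Y)\big] + \mathrm{Var}\big(\mathbb{E}[X\,|\,Y]\big)
  \;\ge\; \mathrm{Var}\big(\mathbb{E}[X\,|\,Y]\big),\\
\text{so}\qquad \mathrm{Var}(\bar S^e_N) &\le \mathrm{Var}(S^e_N)
  \quad\text{and}\quad
  \exp\Big(\frac{-m^2}{8\,\mathrm{Var}(\bar S^e_N)}\Big) \le \exp\Big(\frac{-m^2}{8\,\mathrm{Var}(S^e_N)}\Big).
\end{align*}
```

The replacement makes the bound weaker, but it is still a valid upper bound, which is all that is needed.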

Finally, we choose m_N so that the probability at (4) matches the probability at (5) in scale, and this choice leads directly to (2).

In summary, we decomposed the event that everything is non-negative into two parts: either there are lots of unlikely local events in the field between an even vertex and its odd neighbours, or the field has to be atypically large at the odd sites. Tuning the parameter m_N allows us to control both of these probabilities in the sense required.

Lower bound via a sparse sub-lattice

To get a lower bound on the probability that the field is non-negative on the annulus, we need to exploit the positive correlations in the field. We use a similar idea to the upper bound. If we know the field is positive and fairly large in many places, then it is increasingly likely that it is positive everywhere. The question is how many places to choose?

We are going to consider a sub-lattice that lives in a slightly larger region than W_N itself, and condition the field to be larger than m=m_N everywhere on this lattice. We want the lattice to be sparse enough that even if we ignore positive correlations, the chance of this happening is not too small. But we also want the lattice to be dense enough that, conditional on this event, the chance that the field is actually non-negative everywhere in W_N is not too small either.

To achieve this, Deuschel chooses a sub-lattice of width \lfloor\epsilon L_N^{2/d}\rfloor, and sets \Lambda_N(\epsilon) to be the intersection of this with the annulus with radii [N-\frac{5}{2}L_N, N-\frac{1}{2}L_N], to ensure it lives in a slightly larger region than W_N itself. The scaling of this sub-lattice density is such that when a random walk is started at any v\in W_N, the probability that the RW hits \Lambda_N(\epsilon) before \partial V_N is asymptotically in (0,1). (Ie, not asymptotically zero or one – this requires some definitely non-trivial calculations.) In particular, for appropriate (ie small enough, so that the sub-lattice is dense enough) choice of \epsilon, this probability is at least 1/2 for all v\in W_N. This means that after conditioning on event B_N:=\{h_v\ge m : v\in \Lambda_N(\epsilon)\}, the conditional expectation of h_w is at least \frac{m}{2} for all w\in W_N\backslash \Lambda_N(\epsilon). Again this uses the Gibbs-Markov property and the Gaussian nature of the field. In particular, this conditioning means we are left with the DGFF on V_N\backslash \Lambda_N(\epsilon), ie with boundary \partial V_N\cup \Lambda_N(\epsilon), and then by linearity, the mean at non-boundary points is given by the harmonic extension, which is linear (and so increasing) in the boundary values.

At this point, the route through the calculations is fairly clear. Since we are aiming for a lower bound on the probability of the event \Omega^+(W_N), it’s enough to find a lower bound on P^0_N(\Omega^+(W_N)\cap B_N).

Now, by positive correlation (or, formally, the FKG inequality) we can bound P^0_N(B_N) below by the product of the probabilities that the field exceeds the threshold at each individual site in \Lambda_N(\epsilon). Since the value of the field at each site is normal with variance at least 1 (by definition), this is straightforward.

Finally, we treat P^0_N(\Omega^+(W_N) \,\big|\, B_N). We’ve established that, conditional on B_N, the mean at each point of W_N\backslash \Lambda_N(\epsilon) is at least \frac{m}{2}, and we can bound the variance above too. Again, this is a conditional variance, and so is at most the corresponding original variance, which is bounded above by \sigma_N^2:=\mathrm{Var}(h^N_0). (This fact that the variance is maximised at the centre is intuitively clear when phrased in terms of occupation times, but the proof is non-obvious, or at least non-obvious to me.)

Since each of the events h_v^N\ge 0 for v\in W_N\backslash \Lambda_N(\epsilon) is positively correlated with B_N, we can bound the probability that they hold for all v below by the product of the probabilities that each holds individually. But having established that the conditional mean is at least \frac{m_N}{2} for each v, and the variance is uniformly bounded above (including in N), this gives an easy tail bound of the form we require.

Again it just remains to choose the sequence of thresholds m_N to maximise the lower bound on the probability that we’ve found in this way. In both cases, it turns out that taking m_N= \sqrt{C\log N} is sensible, and this turns out to be linked to the scaling of the maximum of the DGFF, which we will explore in the future.

References

[1] – J-D Deuschel, Entropic Repulsion of the Lattice Free Field, II. The 0-Boundary Case. Available at ProjectEuclid.

DGFF 1 – The discrete Gaussian free field from scratch

I’ve moved to Haifa in northern Israel to start a post-doc in the probability group at the Technion, and now that my thesis is finished I want to start blogging again. The past couple of weeks have been occupied with finding an apartment and learning about the Discrete Gaussian Free Field. All questions about the apartment are solved, but fortunately lots remain open about the DGFF, so I thought I’d write some background about this object and methods which have been used to study it.

Background – Random walk bridge

When we think of a random walk, we usually think of the index as time, normally going forwards. So for a random walk bridge, we might assume Z_0=0, and then condition on Z_N=0, thinking of this as a demand that the process has returned to zero at the future time. In some applications, this is the ideal intuition, but in others, it is more useful to think of the random walk bridge

(0=Z_0,Z_1,\ldots,Z_{N-1},Z_N=0),

as a random height function indexed by [0,N], where the probability of a given path decomposes naturally into a product depending on the N increments, up to a normalising constant.

Naturally, we are interested in the asymptotic behaviour of such a random walk bridge when N\rightarrow\infty. So long as the step distribution has finite variance, a conditioned version of Donsker’s theorem shows that the rescaled random walk bridge converges in distribution to Brownian bridge. Note that Brownian bridge

(B^{\mathrm{br}}_t, t\in[0,1])

can be constructed either by conditioning a standard Brownian motion B to return to zero at time one (modulo some technicalities – this event has zero probability), or by applying an appropriate (random) linear shift

B^{\mathrm{br}}(t):= B(t) - tB(1). (*)

It is not too hard to calculate the distribution of B^{\mathrm{br}}(t) for each t\in[0,1], and with a bit more work, one can calculate the joint distribution of (B^{\mathrm{br}}(s),B^{\mathrm{br}}(t)). In particular, the joint distribution is multivariate Gaussian, and so everything depends on the covariance ‘matrix’ (which here is indexed by [0,1]).
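For the record, the covariance calculation via the linear shift at (*) goes as follows, using only \mathrm{Cov}(B(s),B(t))=s\wedge t for s,t\in[0,1]:

```latex
\begin{align*}
\mathrm{Cov}\big(B^{\mathrm{br}}(s),B^{\mathrm{br}}(t)\big)
  &= \mathrm{Cov}\big(B(s)-sB(1),\,B(t)-tB(1)\big)\\
  &= (s\wedge t) - t\,s - s\,t + st
   = (s\wedge t) - st,\\
\mathrm{Var}\big(B^{\mathrm{br}}(t)\big) &= t(1-t).
\end{align*}
```

In particular the variance vanishes at both endpoints and is largest at t=\frac12.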

So if we return to a random walk bridge what should the step distribution be? Simple symmetric RW is a natural choice, as then lots of the quantities we might want to consider boil down to combinatorial calculations. Cleverness and Stirling’s formula can often get us useful asymptotics. But there are lots of inconveniences, not least the requirement to be careful about parity (N has to be even for a start unless you make the walk lazy, in which case the combinatorics becomes harder), and even if these can be overcome in a given calculation, it would be better not to have this.

The claim is that the random walk with Gaussian increments is by far the easiest to analyse asymptotically. As a further heuristic, think about the statement of the central limit theorem in the case where the underlying distribution is normal: it’s true but obvious. [Indeed, it’s my favourite piece of advice to anyone taking second year probability exams to check that your proposed statement of CLT does actually work for N(\mu,\sigma^2)…] More concretely, if a RW has Gaussian increments, then the path (Z_1,\ldots,Z_N) is a multivariate normal, or a Gaussian process with finite index set. In particular, covariances define the distribution. It remains a Gaussian process after conditioning on Z_N=0, and the linear tilting argument at (*) remains true here, and can indeed be applied to turn any boundary conditions into any other boundary conditions.
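Here is a minimal sketch of the discrete version of the shift at (*): sample a walk with standard Gaussian increments and subtract the linear interpolation of its endpoint. The name gaussian_bridge, the length N=100 and the seed are hypothetical. The reason this gives the conditioned law is that Z_k-\frac{k}{N}Z_N has zero covariance with Z_N, and so is independent of it, since everything is jointly Gaussian.

```python
import random

def gaussian_bridge(N, rng):
    """Gaussian random walk (Z_0, ..., Z_N) with the linear shift Z_k - (k/N) Z_N applied."""
    walk = [0.0]
    for _ in range(N):
        walk.append(walk[-1] + rng.gauss(0.0, 1.0))   # independent N(0,1) increments
    end = walk[-1]
    return [z - (k / N) * end for k, z in enumerate(walk)]

rng = random.Random(1)
bridge = gaussian_bridge(100, rng)
print(bridge[0], bridge[50], bridge[-1])
```

Replacing the subtracted line by one with different endpoint values would give a bridge with non-zero boundary conditions in the same way.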

The discrete Gaussian free field

We know how to generalise the domain of a random walk to higher dimensions. But what about generalising the index to higher dimension? So now there is definitely no arrow of time, and the notion of a random height function above \mathbb{Z}^2 (or a subset of it) is helpful, for which a scaling limit might be a random surface rather than Brownian motion.

Because we can’t well-order \mathbb{Z}^d, it’s harder to define any such random object on the entire lattice immediately, so we start with compact connected subsets, with zero boundary conditions, as in the one-dimensional case of random walk bridge. Formally, let D be a finite subset of \mathbb{Z}^d, and the boundary \partial D those elements of D^c which are adjacent to an element of D, and let \bar D:= D\cup \partial D.

Then, the discrete Gaussian free field on D is a random real vector h^D=(h^D_x: x\in \bar D), with probability density proportional to

\mathbf{1}\{h^D_x=0, x\in\partial D\}\exp\left ( - \frac{1}{4d} \sum_{x\sim y}(h^D_x - h^D_y)^2 \right), (1)

where we write x\sim y if x,y are adjacent in \bar D. We won’t at any stage worry much about the partition function which normalises this pdf. Note also that \frac{1}{4d} is just a convenient choice of constant, which corresponds to one of the canonical choices for the discrete Laplacian. Adjusting this constant is the same as uniformly rescaling the values taken by the field.

The immediate interpretation of (1) is that the values taken by the field at vertices which are close to each other are positively correlated. Furthermore, the form of the density is Gaussian. Concretely, if the values of h^D are fixed everywhere except one vertex x\in D, then the conditional distribution of h^D_x is Gaussian. Later, or in subsequent posts, we will heavily develop this idea. Alternatively, we could if we really wanted describe the model in terms of independent Gaussians describing the ‘increment’ along each edge in D (which we should direct), subject to a very large number of conditions, namely that the sum of increments along any directed cycle is zero. This latter description might be more useful if you wanted to define a DGFF on a more sparse graph, but won’t be useful in what follows.

Note that we can rearrange the Laplacian in (1) in terms of the transition kernel p(\cdot) of the simple random walk on D to obtain

\exp\left( -\frac12 (h^D)^T (\mathbf{1}-\mathbf{P})h^D \right),

where P_{x,y}=p(y-x) is the transition matrix of SRW on D. In particular, this means that the free field is Gaussian, and we can extract the covariances via

\mathrm{Cov}(h^D_x,h^D_y) = \left[ (\mathbf{1}-\mathbf{P})^{-1}\right]_{x,y}

= \left[\sum_{n\ge 0} \mathbf{P}^n\right]_{x,y} = \sum_{n\ge 0} \mathbb{P}_x\left[X_n=y,\tau_{\partial D}>n\right],

where, under \mathbb{P}_x, (X_0,X_1,\ldots) is simple random walk started from x.

This final quantity records the expected number of visits to y before leaving the domain D, for a random walk started at x, and is called the Green’s function.
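The rearrangement of (1) can be checked by expanding the square. Here I'm assuming that the sum in (1) runs over unordered adjacent pairs in \bar D, and that \mathbf{P} is restricted to D, so that P_{x,y}=\frac{1}{2d} for adjacent x,y\in D. Each x\in D has exactly 2d neighbours in \bar D, and h^D vanishes on \partial D.

```latex
\begin{align*}
\sum_{x\sim y}(h_x-h_y)^2
  &= \sum_{x\in D} 2d\,h_x^2 \;-\; \sum_{\substack{x,y\in D\\ x\sim y}} h_xh_y
   \qquad\text{(second sum over ordered pairs)}\\
  &= 2d\sum_{x\in D}h_x^2 - 2d\sum_{x,y\in D}P_{x,y}h_xh_y
   = 2d\; h^T(\mathbf{1}-\mathbf{P})h,\\
\exp\Big(-\frac{1}{4d}\sum_{x\sim y}(h_x-h_y)^2\Big)
  &= \exp\Big(-\frac12\, h^T(\mathbf{1}-\mathbf{P})h\Big).
\end{align*}
```

Reading off the precision matrix \mathbf{1}-\mathbf{P} then gives the covariance (\mathbf{1}-\mathbf{P})^{-1}, and the Neumann series \sum_{n\ge 0}\mathbf{P}^n converges because the restricted walk is killed on leaving D.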

In summary, the DGFF on D is the centred Gaussian random vector indexed by \bar D with covariance given by the Green’s function G_D(x,y).

How many of these equivalences carry over to more general D-indexed random fields is discussed in the survey paper by Velenik. But it’s worth emphasising that having the covariance given by the Green’s function as in the definition we’ve just given is a very nice property, as there are lots of pre-existing tools for calculating these. By contrast, it’s hard to think of a natural model for an integer-valued surface of this kind, as an analogue to SRW.

[Though definitely not impossible. The nicest example I’ve heard of is for height functions of large uniform domino tilings within their ‘arctic circle’, which have GFF asymptotics. See this paper by Kenyon.]

A continuous limit?

We motivated the discussion of random walk bridge by the limit object, namely Brownian bridge. Part of the reason why the DGFF is more interesting than Gaussian random walk bridge, is that the limit object, the (continuum) Gaussian free field is hard to define classically in two dimensions.

We might suppose that the DGFF in V_N, the square box of width N has some scaling limit as N\rightarrow\infty. However, for fixed x,y\in [0,1]^2, (and taking integer parts component-wise), well-known asymptotics for SRW in a large square lattice (more on this soon hopefully) assert that

\mathrm{Cov}(h^{V_N}_{\lfloor Nx \rfloor},h^{V_N}_{\lfloor Ny\rfloor}) \sim \log |x-y|, (2)

and so any scaling limit will rescale only the square domain, not the height (since there is no N on the RHS of (2)). However, then the variance of the proposed limit is infinite everywhere.

So the GFF does not exist as a random height function on [0,1]^2, with the consequence that a) more care is needed over its abstract definition; b) the DGFF in 2D on a large square is an interesting object, since it does exist in this sense.
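
The divergence can already be seen on the lattice. The following sketch is an illustration under assumptions: the box is taken to be \{1,\ldots,N-1\}^2 with N even, the Green’s function is normalised as the expected number of visits, and centre_green is a hypothetical helper. It computes the variance G^{V_N}(c,c) at the centre c of the box by Gauss-Seidel iteration. The values keep growing as N doubles, consistent with a variance that grows logarithmically in N, whereas the covariance between macroscopically separated points stays bounded as in (2).

```python
import math


def centre_green(n, tol=1e-10):
    """G(c, c) for SRW on {1..n-1}^2 killed on exit, with c the centre (n even).

    Solves u(x) = 1_{x = c} + (average of u over the four neighbours of x),
    with u = 0 outside the box, by Gauss-Seidel sweeps; then u(x) = G(x, c).
    """
    c = (n // 2, n // 2)
    u = [[0.0] * (n + 1) for _ in range(n + 1)]  # boundary rows and columns stay 0
    change = 1.0
    while change > tol:
        change = 0.0
        for i in range(1, n):
            for j in range(1, n):
                new = 0.25 * (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1])
                if (i, j) == c:
                    new += 1.0  # the visit at time zero
                change = max(change, abs(new - u[i][j]))
                u[i][j] = new
    return u[c[0]][c[1]]


values = {n: centre_green(n) for n in (4, 8, 16)}
for n, g in values.items():
    # Print the variance at the centre next to log(n) for comparison.
    print(n, round(g, 4), round(math.log(n), 4))
```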

What makes it ‘free’?

This seemed like a natural question to ask, but I’ve received various answers. Some sources seem to suggest that ‘free’ refers to having zero boundary conditions. Other sources refer to the Hamiltonian (that is, the term inside the exponential function at (1)) as free since it depends only on the increments between values. If the Hamiltonian also depends on the heights themselves, for example via the addition of a \sum_{x} \Psi(h^D_x) term, then for suitable choice of function \Psi, this is interpreted as a model where the particles have mass. The physical interpretation of these more general Gibbs measures is discussed widely, and I’m not very comfortable with it all at the moment, but I aim to come back to it later, when hopefully I will be more comfortable.

SLE Revision 4: The Gaussian Free Field and SLE4

I couldn’t resist breaking the order of my revision notes in order that the title might be self-referential. Anyway, it’s the night before my exam on Conformal Invariance and Randomness, and I’m practising writing this in case of an essay question about the Gaussian Free Field and its relation to the SLE objects discussed in the course.

What is a Gaussian Free Field?

The most natural definition is too technical for this context. Instead, recall that we could very informally consider a Poisson random measure to have the form of a series of Poisson random variables placed at each point in the domain, weighted infinitesimally so that the integrals over an area give a Poisson random variable with mean proportional to the measure of the area, and so that different areas are independent. Here we do a similar thing, only with infinitesimal centred Gaussians, for which we have to specify the covariance structure.

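We define the Green’s function on a domain D, which has a resonance with PDE theory, by the following limit (normalising by \epsilon^2, which is proportional to the area of the ball, so that the limit is non-trivial, and ignoring constants):

G_D(x,y)=\lim_{\epsilon\rightarrow 0}\frac{1}{\epsilon^2}\mathbb{E}[\text{time spent in }B(y,\epsilon)\text{ by BM started at }x\text{ stopped at }T_D]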

We want the covariance structure of the hypothetical infinitesimal Gaussians to be given by \mathbb{E}(g(x)g(y))=G_D(x,y). So formally, we define (\Gamma(A),A\subset D) for open A by requiring that (\Gamma(A_1),\ldots,\Gamma(A_n)) is a centred Gaussian vector with covariances \mathbb{E}(\Gamma(A_1)\Gamma(A_2))=\int_{A_1\times A_2}G_D(x,y)dxdy.

The good news is that on the unit disc U we have a nice expression G_U(0,x)=\log\frac{1}{|x|}, and the Green’s functions are conformally invariant in the sense that G_{\phi(D)}(\phi(x),\phi(y))=G_D(x,y), which follows directly from the conformal invariance of Brownian motion.
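
As a sanity check on conformal invariance, here is a short Python sketch. It uses the half-plane Green’s function \log\left|\frac{x-\bar{y}}{x-y}\right| quoted later in this post, and checks it against the disc formula via the Cayley map z\mapsto (z-i)/(z+i), and against a real Möbius map of the half-plane. The particular maps, test points and function names are assumptions chosen for illustration, and constants are treated as loosely as in the text.

```python
import math


def green_half_plane(x, y):
    """Green's function of the upper half-plane: log|(x - conj(y)) / (x - y)|."""
    return math.log(abs((x - y.conjugate()) / (x - y)))


def green_disc_from_origin(w):
    """Green's function of the unit disc with one argument at 0: log(1/|w|)."""
    return math.log(1.0 / abs(w))


def cayley(z):
    """Conformal map from the upper half-plane onto the unit disc, sending i to 0."""
    return (z - 1j) / (z + 1j)


def mobius(z, a=2.0, b=-1.0, c=1.0, d=3.0):
    """A conformal automorphism of the upper half-plane (real a, b, c, d with ad - bc > 0)."""
    return (a * z + b) / (c * z + d)


x, y = -0.7 + 0.2j, 0.4 + 1.3j

# Half-plane to disc: G_H(i, y) should equal G_U(0, cayley(y)).
disc_gap = green_half_plane(1j, y) - green_disc_from_origin(cayley(y))

# Half-plane to itself: G_H should be unchanged when both points are mapped.
mobius_gap = green_half_plane(mobius(x), mobius(y)) - green_half_plane(x, y)

print(disc_gap, mobius_gap)  # both should vanish up to rounding error
```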

The bad news is that existence is not clear. The motivation is the following, though. We have a so-called excursion measure for BMs in a domain D. There isn’t time to discuss this now: it is infinite, and invariant under translations of the boundary (assuming the boundary is \mathbb{R}\subset \bar{\mathbb{H}}, which is fine after taking a conformal map). Then take a Poisson Point Process on the set of Brownian excursions with this measure. Now define a function f on the boundary \partial D of the domain, and define \Gamma_f(A) to be the sum of the values of f at the starting points of BMs in the PPP passing through A, weighted by the time spent in A. We have a universality relation given by the central limit theorem: if we define h to be (in a pointwise limit) the expected value of this variable, and we take n independent copies, we have:

\frac{1}{\sqrt{n}}\left[\Gamma_f^1(A)+\ldots+\Gamma_f^n(A)-n\int_Ah(x)dx\right]\rightarrow\Gamma(A)

where this limiting random variable is Gaussian.

For now though, we assume existence without full proof.

SLE_4

We consider chordal SLE_\kappa, which has the form of a curve \gamma[0,\infty] from 0 to \infty in \mathbb{H}. With g_t the regularising function as normal, consider \tilde{X}_t=X_t-W_t:=g_t(x)-\sqrt{\kappa}\beta_t for some fixed x. We are interested in the evolution of the argument \theta_t:=\arg\tilde{X}_t=\Im\log\tilde{X}_t. Note that conditional on the (almost sure for \kappa\le 4) event that x does not lie on the curve, \theta_t will converge either to 0 or \pi almost surely, depending on whether the curve passes to the left or the right (respectively) of x.

By Loewner’s DE for the upper half-plane and Ito’s formula:

d\tilde{X}_t=\frac{2}{\tilde{X}_t}dt-\sqrt{\kappa}d\beta_t,\quad d\log\tilde{X}_t=(2-\frac{\kappa}{2})\frac{dt}{\tilde{X}_t^2}-\frac{\sqrt{\kappa}}{\tilde{X}_t}d\beta_t

So, when \kappa=4, the dt terms vanish, which gives that \log\tilde{X}_t is a local martingale, and so its imaginary part satisfies

d\theta_t=-\Im\left(\frac{2}{\tilde{X}_t}\right)d\beta_t

and \theta_t is a true martingale since it is bounded. Note that

\theta_t=\mathbb{E}[\pi1(x\text{ on left of }\gamma)|\mathcal{F}_t]

Note also that

\mathbb{P}(\text{BM started at }x\text{ hits }\gamma[0,t]\cup\mathbb{R}\text{ to the left of }\gamma(t)|\gamma[0,t])=\frac{\theta_t}{\pi}.
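
At time t=0 there is no curve yet, and this identity reduces to a classical fact: for z in the upper half-plane, the probability that a Brownian motion from z first hits the real line on the negative half-axis is \arg(z)/\pi. A rough numerical check, integrating the Poisson kernel of the half-plane with a midpoint rule, might look as follows. The substitution, the step count and the function name are assumptions made for this sketch.

```python
import cmath
import math


def harmonic_measure_negative_axis(z, n=4000):
    """P(BM from z in the upper half-plane exits through (-inf, 0)).

    Integrates the Poisson kernel (1/pi) * y / ((x - t)^2 + y^2) over t < 0,
    after substituting t = -s / (1 - s) with s in (0, 1), by the midpoint rule.
    """
    x, y = z.real, z.imag
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        # Integrand in the variable s; the Jacobian 1/(1-s)^2 cancels against
        # the factor (1-s)^2 obtained by clearing denominators in the kernel.
        total += y / ((x * (1 - s) + s) ** 2 + (y * (1 - s)) ** 2)
    return total * h / math.pi


for z in (1 + 1j, -1 + 0.5j, 0.3 + 0.7j):
    # The two columns should agree up to the quadrature error.
    print(z, harmonic_measure_negative_axis(z), cmath.phase(z) / math.pi)
```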

SLE_4 and the Gaussian Free Field on H

We will show that this chordal SLE_4 induces a conformal Markov type of property in Gaussian Free Fields constructed on the slit-domain. Precisely, we will show that if \Gamma_T is a GFF on H_T=\mathbb{H}\backslash\gamma[0,T], then \Gamma_T+ch_T(\cdot)\stackrel{d}{=}\Gamma_0+ch_0(\cdot), where c is a constant to be determined, and h_t(x)=\theta_t(x) in keeping with the lecturer’s notation!

It will suffice to check that for all fixed p with compact support \Gamma_T(p)+c(h_T(p)-h_0(p)) is a centred Gaussian with variance \int dxdyG_H(x,y)p(x)p(y).

First, applying Ito and conformal invariance of the Green’s functions under the maps g_t,

dG_{H_t}(x,y)=cd[h(x),h(y)]_t

The details are not particularly illuminating, but exploit the fact that the Green’s function on H has a reasonably nice form \log\left|\frac{x-\bar{y}}{x-y}\right|. We are also being extremely lax with constants, but we have plenty of freedom there.
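
For readers who do want to see the details, here is one way to check the claim numerically. By conformal invariance G_{H_t}(x,y)=G_H(g_t(x),g_t(y)), and since the half-plane formula is unchanged by real translations, subtracting W_t from both points makes no difference. So only the Loewner drift \frac{2}{Z}dt of each point Z contributes, and differentiating the logarithm by hand suggests that the dt-coefficient is -4\Im(\frac{1}{X})\Im(\frac{1}{Y}), where X,Y are the current positions of the two points. The sketch below compares this with a finite difference. The test points, step size and function names are assumptions, and the constant 4 depends on the normalisations used here.

```python
import math


def green_half_plane(x, y):
    """Green's function of the upper half-plane: log|(x - conj(y)) / (x - y)|."""
    return math.log(abs((x - y.conjugate()) / (x - y)))


def dG_dt_numeric(X, Y, eps=1e-6):
    """Central finite difference of G_H when both points move with velocity 2/Z."""
    forward = green_half_plane(X + eps * 2 / X, Y + eps * 2 / Y)
    backward = green_half_plane(X - eps * 2 / X, Y - eps * 2 / Y)
    return (forward - backward) / (2 * eps)


def dG_dt_formula(X, Y):
    """Candidate closed form for the same derivative: -4 Im(1/X) Im(1/Y)."""
    return -4.0 * (1 / X).imag * (1 / Y).imag


for X, Y in ((1j, 2j), (0.5 + 1.2j, -0.8 + 0.3j)):
    # The two values should agree up to the finite-difference error.
    print(dG_dt_numeric(X, Y), dG_dt_formula(X, Y))
```

Since d[\theta(x),\theta(y)]_t is a constant multiple of \Im(\frac{1}{X})\Im(\frac{1}{Y})dt when \kappa=4, this is consistent with the displayed relation for a suitable (here negative) constant c.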

After applying Ito and some (for now unjustified) Fubini:

dh_t(p)=\left(\int c.p(x)\Im\left(\frac{1}{\tilde{X}_t}\right)dx\right)d\beta_t

and so, as we would have suspected (since each h_t(x) is), this is a local martingale. We now deploy Dubins-Schwarz:

h_T(p)-h_0(p)\stackrel{d}{=}B_{\sigma(T)} for B an independent BM and

\sigma(T)=\int_0^Tdt(\int c.p(x)\Im(\frac{1}{\tilde{X}_t})dx)^2

So conditional on (h_t(p),t\in[0,T]), we want to make up the difference to \Gamma_0. Add to h_T(p)-h_0(p) an independent random variable distributed as N(0,s-\sigma(T)), where

s=\int dxdyp(x)p(y)G(x,y)\quad =\mathrm{Var}(\Gamma_0(p))

Then

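s-\sigma(T)=\int p(x)p(y)[G(x,y)-c\int_0^Tdt\Im(\frac{1}{\tilde{X}_t})\Im(\frac{1}{\tilde{Y}_t})]dxdy=\int p(x)p(y)G_{H_T}(x,y)dxdy, as desired, where \tilde{Y}_t:=g_t(y)-W_t is defined from y just as \tilde{X}_t is from x.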

Why is this important?

This is important, or at least interesting, because we can use it to reverse engineer the SLE. Informally, we let T\rightarrow\infty in the previous result. This states that taking a GFF in the domain left by removing the whole of the SLE curve (whatever that means), then adding \pi at points on the left of the curve, which is the limit \lim_T h_T, gives the same thing as a normal GFF on the upper half-plane added to the argument function. It is reasonable to conjecture that a GFF in a disconnected domain has the same structure as taking independent GFFs in each component, and this gives an interesting invariance condition on GFFs. It can also be observed (Schramm-Sheffield) that SLE_4 arises by reversing the argument: take an appropriately conditioned GFF on H and look for the interface between it being ‘large’ and ‘small’ (obviously this is a ludicrous simplification). This interface is then, under a suitable limit, SLE_4.