# Skorohod embedding

Background

Suppose we are given a standard Brownian motion $(B_t)$, and a stopping time T. Then, so long as T satisfies one of the regularity conditions under which the Optional Stopping Theorem applies, we know that $\mathbb{E}[B_T]=0$. (See here for a less formal introduction to OST.) Furthermore, since $B_t^2-t$ is a martingale, $\mathbb{E}[B_T^2]=\mathbb{E}[T]$, so if the latter is finite, so is the former.

Now, using the strong Markov property of Brownian motion, we can come up with a sequence of stopping times $0=T_0, T_1, T_2,\ldots$ such that the increments $T_k-T_{k-1}$ are IID with the same distribution as T. Then $0,B_{T_1},B_{T_2},\ldots$ is a centered random walk. By taking T to be the hitting time of $\{-1,+1\}$, it is easy to see that we can embed simple random walk in a Brownian motion using this approach.

Embedding simple random walk in Brownian motion.

The Skorohod embedding question asks: can all centered random walks be constructed in this fashion, by stopping Brownian motion at a sequence of stopping time? With the strong Markov property, it immediately reduces the question of whether all centered finite-variance distributions X can be expressed as $B_T$ for some integrable stopping time T.

The answer to this question is yes, and much of what follows is drawn from, or at least prompted by Obloj’s survey paper which details the problem and rich history of the many approaches to its solution over the past seventy years.

Applications and related things

The relationship between random walks and Brownian motion is a rich one. Donsker’s invariance principle asserts that Brownian motion appears as the scaling limit of a random walk. Indeed, one can construct Brownian motion itself as the limit of a sequence of consistent random walks with normal increments on an increasingly dense set of times. Furthermore, random walks are martingales, and we know that continuous, local martingales can be expressed as a (stochastically) time-changed Brownian motion, from the Dubins-Schwarz theorem.

The Skorohod embedding theorem can be used to prove results about random walks with general distribution by proving the corresponding result for Brownian motion, and checking that the construction of the sequence of stopping times has the right properties to allow the result to be carried back to the original setting. It obviously also gives a coupling between a individual random walk and a Brownian motion which may be useful in some contexts, as well as a coupling between any pair of random walks. This is useful in proving results for random walks which are much easier for special cases of the distribution. For example, when the increments are Gaussian, or when there are combinatorial approaches to a problem about simple random walk. At the moment no aspect of this blog schedule is guaranteed, but I plan to talk about the law of the iterated logarithm shortly, whose proof is approachable in both of these settings, as well as for Brownian motion, and Skorohod embedding provides the route to the general proof.

At the end, we will briefly compare some other ways to couple a random walk and a Brownian motion.

One thing we could do is sample a copy of X independently from the Brownian motion, then declare $T= \tau_{X}:= \inf\{t\ge 0: B_t=X\}$, the hitting time of (random value) X. But recall that unfortunately $\tau_x$ has infinite expectation for all non-zero x, so this doesn’t fit the conditions required to use OST.

Skorohod’s original method is described in Section 3.1 of Obloj’s notes linked above. The method is roughly to pair up positive values taken by X appropriately with negative values taken by X in a clever way. If we have a positive value b and a negative value a, then $\tau_{a,b}$, the first hitting time of $\mathbb{R}\backslash (a,b)$ is integrable. Then we choose one of these positive-negative pairs according to the projection of the distribution of X onto the pairings, and let T be the hitting time of this pair of values. The probability of hitting b conditional on hitting {a,b} is easy to compute (it’s $\frac{-a}{b-a}$) so we need to have chosen our pairs so that the ‘probability’ of hitting b (ie the density) comes out right. In particular, this method has to start from continuous distributions X, and treat atoms in the distribution of X separately.

The case where the distribution X is symmetric (that is $X\stackrel{d}=-X$) is particularly clear, as then the pairs should be $(-x,x)$.

However, it feels like there is enough randomness in Brownian motion already, and subsequent authors showed that indeed it wasn’t necessary to introduce extra randomness to provide a solution.

One might ask whether it’s possible to generate the distribution on the set of pairs (as above) out of the Brownian motion itself, but independently from all the hitting times. It feels like it might be possible to make the distribution on the pairs measurable with respect to

$\mathcal{F}_{0+} = \bigcap\limits_{t>0} \mathcal{F}_t,$

the sigma-algebra of events determined by limiting behaviour as $t\rightarrow 0$ (which is independent of hitting times). But of course, unfortunately $\mathcal{F}_{0+}$ has a zero-one law, so it’s not possible to embed non-trivial distributions there.

Dubins solution

The exemplar for solutions without extra randomness is due to Dubins, shortly after Skorohod’s original argument. The idea is to express the distribution X as the almost sure limit of a martingale. We first use the hitting time of a pair of points to ‘decide’ whether we will end up positive or negative, and then given this information look at the hitting time (after this first time) of two subsequent points to ‘decide’ which of four regions of the real interval we end up in.

I’m going to use different notation to Obloj, corresponding more closely with how I ended up thinking about this method. We let

$a_+:= \mathbb{E}[X \,|\, X>0], \quad a_- := \mathbb{E}[X\,|\, X<0],$ (*)

and take $T_1 = \tau_{\{a_-,a_+\}}$. We need to check that

$\mathbb{P}\left( B_{T_1}=a_+\right) = \mathbb{P}\left(X>0\right),$

for this to have a chance of working. But we know that

$\mathbb{P}\left( B_{T_1}=a_+\right) = \frac{a_+}{a_+-a_-},$

and we can also attack the other side using (*) and the fact that $\mathbb{E}[X]=0$, using the law of total expectation:

$0=\mathbb{E}[X]=\mathbb{E}[X\,|\, X>0] \mathbb{P}(X>0) + \mathbb{E}[X\,|\,X<0]\mathbb{P}(X<0) = a_+ \mathbb{P}(X>0) + a_- \left(1-\mathbb{P}(X>0) \right),$

$\Rightarrow\quad \mathbb{P}(X>0)=\frac{a_+}{a_+-a_-}.$

Now we define

$a_{++}=\mathbb{E}[X \,|\, X>a_+],\quad a_{+-}=\mathbb{E}[X\,|\, 0

and similarly $a_{-+},a_{--}$. So then, conditional on $B_{T_1}=a_+$, we take

$T_2:= \inf_{t\ge T_1}\left\{ B_t\not\in (a_{+-},a_{++}) \right\},$

and similarly conditional on $B_{T_1}=a_-$. By an identical argument to the one we have just deployed, we have $\mathbb{E}\left[B_{T_2} \,|\,\mathcal{F}_{T_1} \right] = B_{T_1}$ almost surely. So, although the $a_{+-+}$ notation now starts to get very unwieldy, it’s clear we can keep going in this way to get a sequence of stopping times $0=T_0,T_1,T_2,\ldots$ where $B_{T_n}$ determines which of the $2^n$ regions of the real line any limit $\lim_{m\rightarrow\infty} B_{T_m}$ should lie in.

A bit of work is required to check that the almost sure limit $T_n\rightarrow T$ is almost surely finite, but once we have this, it is clear that $B_{T_n}\rightarrow B_T$ almost surely, and $B_T$ has the distribution required.

We want to know how close we can make this coupling between a centered random walk with variance 1, and a standard Brownian motion. Here, ‘close’ means uniformly close in probability. For large times, the typical difference between one of the stopping times $0,T_1,T_2,\ldots$ in the Skorohod embedding and its expectation (recall $\mathbb{E}[T_k]=k$) is $\sqrt{n}$. So, constructing the random walk $S_0,S_1,S_2,\ldots$ from the Brownian motion via Skorohod embedding leads to

$\left |S_k - B_k \right| = \omega(n^{1/4}),$

for most values of $k\le n$. Strassen (1966) shows that the true scale of the maximum

$\max_{k\le n} \left| S_k - B_k \right|$

is slightly larger than this, with some extra powers of $\log n$ and $\log\log n$ as one would expect.

The Komlos-Major-Tusnady coupling is a way to do a lot better than this, in the setting where the distribution of the increments has a finite MGF near 0. Then, there exists a coupling of the random walk and the Brownian motion such that

$\max_{k\le n}\left|S_k- B_k\right| = O(\log n).$

That is, there exists C such that

$\left[\max_{k\le n} \left |S_k-B_k\right| - C\log n\right] \vee 0$

is a tight family of distributions, indeed with uniform exponential tail. To avoid digressing infinitely far from my original plan to discuss the proof of the law of iterated logarithm for general distributions, I’ll stop here. I found it hard to find much coverage of the KMT result apart from the challenging original paper, and many versions expressed in the language of empirical processes, which are similar to random walks in many ways relevant to convergence and this coupling, but not for Skorohod embedding. So, here is a link to some slides from a talk by Chatterjee which I found helpful in getting a sense of the history, and some of the modern approaches to this type of normal approximation problem.

# DGFF 3 – Gibbs-Markov property for entropic repulsion

In the previous post, we saw that it isn’t much extra effort to define the DGFF with non-zero boundary conditions, by adding onto the zero-BC DGFF the unique (deterministic) harmonic function which extends the boundary values into the domain. We also saw how a Gibbs-Markov property applies, whereby the values taken by the field on some sub-region $A\subset D$ depend on the values taken on $D\backslash A$ only through values taken on $\partial A$.

In this post, we look at how this property and some other methods are applied by Deuschel [1] to study the probability that the DGFF on a large box in $\mathbb{Z}^d$ is positive ‘everywhere’. This event can be interpreted in a couple of ways, all of which are referred to there as entropic repulsion. Everything which follows is either taken directly or paraphrased directly from [1]. I have tried to phrase this in a way which avoids repeating most of the calculations, instead focusing on the methods and the motivation for using them.

Fix dimension $d\ge 2$ throughout. We let $P^0_N$ be the law of the DGFF on $V_N:=[-N,N]^d\subset \mathbb{Z}^d$ with zero boundary conditions. Then for any subset $A\subset \mathbb{Z}^d$, in an intuitively-clear abuse of notation, we let

$\Omega^+(A):= \{ h_x\ge 0, x\in A\},$

be the event that some random field h takes only non-negative values on A. The goal is to determine $P^0_N ( \Omega^+(V_N))$. But for the purposes of this post, we will focus on showing bounds on the probability that the field is non-negative on a thin annulus near the boundary of $V_N$, since this is a self-contained step in the argument which contains a blog-friendly number of ideas.

We set $(L_N)$ to be a sequence of integers greater than one (to avoid dividing by zero in the statement), for which $\frac{L_N}{N}\rightarrow 0$. We now define for each N, the annulus

$W_N = \{v\in V_N: L_N\le d_{\mathbb{Z}^d}(v, V_N^c)\le 2L_N \}$

with radius $L_N$ set a distance $L_N$ inside the box $V_N$. We aim to control $P^N_0 (\Omega^+(W_N))$. This forms middle steps of Deuschel’s Propositions 2.5 and 2.9, which discuss $P^N_0(\Omega^+(V_{N-L_N}))$. Clearly there is the upper bound

$P^N_0(\Omega^+(V_{N-L_N})) \le P^N_0(\Omega^+(W_N))$ (1)

and a lower bound on $P^N_0(\Omega^+(V_{N-L_N}))$ is obtained in the second proposition by considering the box as a union of annuli then combining the bounds on each annulus using the FKG inequality.

Upper bound via odds and evens

After removing step (1), this is Proposition 2.5:

$\limsup_{N\rightarrow \infty} \frac{L_N}{N^{d-1} \log L_N} \log P^N_0(\Omega^+(W_N)) < 0.$ (2)

This is giving a limiting upper bound on the probability of the form $L_N^{-CN^{d-1}/L_N}$, though as with all LDP estimates, the form given at (2) is more instructive.

Morally, the reason why it is unlikely that the field should be non-negative everywhere within the annulus is that the distribution at each location is centred, and even though any pair of values are positively correlated, this correlation is not strong enough to avoid this event being unlikely. But this is hard to corral into an upper bound argument directly. In many circumstances, we want to prove upper bounds for complicated multivariate systems by projecting to get an unlikely event for a one-dimensional random variable, or a family of independent variables, even if we have to throw away some probability. We have plenty of tools for tail probabilities in both of these settings. Since the DGFF is normal, a one-dimensional RV that is a linear combination (eg the sum) of all the field heights is a natural candidate. But in this case we would have thrown away too much probability, since the only way we could dominate is to demand that the sum $\sum_{x\in W_N}h^N_x\ge 0$, which obviously has probability 1/2 by symmetry. (3)

So Deuschel splits $W_N$ into $W_N^o,W_N^e$, where the former includes all vertices with odd total parity in $W_N$ and the latter includes all the vertices with even total parity in the interior of $W_N$. (Recall that $\mathbb{Z}^d$ is bipartite in exactly this fashion). The idea is to condition on $h^N\big|_{W^o_N}$. But obviously each even vertex is exactly surrounded by odd vertices. So by the Gibbs-Markov property, conditional on the odd vertices, the values of the field at the even vertices are independent. Indeed, if for each $v\in W_N^e$ we define $\bar h_v$ to be the average of its neighbours (which is measurable w.r.t to the sigma-algebra generated by the odd vertices), then

$\{h_v: v\in W_N^e \,\big|\, \sigma(h_w: w\in W_N^o)\},$

is a collection of independent normals with variance one, and where the mean of $h_v$ is $\bar h_v$.

To start finding bounds, we fix some threshold $m=m_N\gg 1$ to be determined later, and consider the odd-measurable event $A_N$ that at most half of the even vertices v have $\bar h_v\ge m$. So $A_N^c\cap \Omega^+(W_N)$ says that all the odd vertices are non-negative and many are quite large. This certainly feels like a low-probability event, and unlike at (3), we might be able to obtain good tail bounds by projection into one dimension.

In the other case, conditional on $A_N$, there are a large number of even vertices with conditional mean at most m, and so we can control the probability that at least one is negative as a product

$(1-\varphi(m))^{\frac12 |W_N^e|}$. (4)

Note that for this upper bound, we can completely ignore the other even vertices (those with conditional mean greater than m).

So we’ll go back to $A_N^c \cap \Omega^+(W_N)$. For computations, the easiest one-dimensional variable to work with is probably the mean of the $\bar h_v$s across $v\in W_N^e$, since on $A_N^c\cap \Omega^+(W_N)$ this is at least $\frac{m}{2}$. Rather than focus on the calculations themselves involving

$\bar S^e_N:= \frac{1}{|W_N^e|} \sum\limits_{v\in W_N^e} \bar h_v,$

let us remark that it is certainly normal and centered, and so there are many methods to bound its tail, for example

$P^0_N \left( \bar S^e_N \ge \frac{m}{2} \right) \le \exp\left( \frac{-m^2}{8\mathrm{Var}(\bar S^e_N)} \right),$ (5)

as used by Deuschel just follows from an easy comparison argument within the integral of the pdf. We can tackle the variance using the Green’s function for the random walk (recall the first post in this set). But before that, it’s worth making an observation which is general and useful, namely that $\bar S^e_N$ is the expectation of

$S^e_N:= \sum{1}{|W_N^e|}\sum\limits_{v\in W_N^e} h_v$

conditional on the odds. Directly from the law of total variance, the variance of any random variable X is always larger than the variance of $\mathbb{E}[X|Y]$.

So in this case, we can replace $\mathrm{Var}(\bar S^e_N)$ in (5) with $\mathrm{Var}(S^e_N)$, which can be controlled via the Green’s function calculation.

Finally, we choose $m_N$ so that the probability at (4) matches the probability at (5) in scale, and this choice leads directly to (2).

In summary, we decomposed the event that everything is non-negative into two parts: either there are lots of unlikely local events in the field between an even vertex and its odd neighbours, or the field has to be atypically large at the odd sites. Tuning the parameter $m_N$ allows us to control both of these probabilities in the sense required.

Lower bound via a sparse sub-lattice

To get a lower bound on the probability that the field is non-negative on the annulus, we need to exploit the positive correlations in the field. We use a similar idea to the upper bound. If we know the field is positive and fairly large in many places, then it is increasingly likely that it is positive everywhere. The question is how many places to choose?

We are going to consider a sub-lattice that lives in a slightly larger region than $W_N$ itself, and condition the field to be larger than $m=m_N$ everywhere on this lattice. We want the lattice to be sparse enough that even if we ignore positive correlations, the chance of this happening is not too small. But we also want the lattice to be dense enough that, conditional on this event, the chance that the field is actually non-negative everywhere in $W_N$ is not too small either.

To achieve this, Deuschel chooses a sub-lattice of width $\lfloor\epsilon L_N^{2/d}\rfloor$, and sets $\Lambda_N(\epsilon)$ to be the intersection of this with the annulus with radii $[N-\frac{5}{2}L_N, N-\frac{1}{2}L_N]$, to ensure it lives in a slightly larger region than $W_N$ itself. The scaling of this sub-lattice density is such that when a random walk is started at any $v\in W_N$, the probability that the RW hits $\Lambda_N(\epsilon)$ before $\partial V_N$ is asymptotically in (0,1). (Ie, not asymptotically zero or one – this requires some definitely non-trivial calculations.) In particular, for appropriate (ie large enough) choice of $\epsilon$, this probability is at least 1/2 for all $v\in W_N$. This means that after conditioning on event $B_N:=\{h_v\ge m : v\in \Lambda_N(\epsilon)\}$, the conditional expectation of $h_w$ is at least $\frac{m}{2}$ for all $w\in W_N\backslash \Lambda_N(\epsilon)$. Again this uses the Gibbs-Markov property and the Gaussian nature of the field. In particular, this conditioning means we are left with the DGFF on $V_N\backslash \Lambda_N(\epsilon)$, ie with boundary $\partial V_N\cup \Lambda_N(\epsilon)$, and then by linearity, the mean at non-boundary points is given by the harmonic extension, which is linear (and so increasing) in the boundary values.

At this point, the route through the calculations is fairly clear. Since we are aiming for a lower bound on the probability of the event $\Omega^+(W_N)$, it’s enough to find a lower bound on $P^0_N(\Omega^+(W_N)\cap B)$.

Now, by positive correlation (or, formally, the FKG inequality) we can control $P^0_N(B)$ just as a product of the probabilities that the field exceeds the threshold at each individual site in $\Lambda_N(\epsilon)$. Since the value of the field at each site is normal with variance at least 1 (by definition), this is straightforward.

Finally, we treat $P^0_N(\Omega^+(W_N) \,\big|\, B)$. We’ve established that, conditional on B, the mean at each point of $W_N\backslash \Lambda_N(\epsilon)$ is at least $\frac{m}{2}$, and we can bound the variance above too. Again, this is a conditional variance, and so is at most the corresponding original variance, which is bounded above by $\sigma_N^2:=\mathrm{Var}(h^N_0)$. (This fact that the variance is maximised at the centre is intuitively clear when phrased in terms of occupation times, but the proof is non-obvious, or at least non-obvious to me.)

Since each of the event $h_v^N\ge 0$ for $v\in W_N\backslash \Lambda_N(\epsilon)$ is positively correlated with B, we can bound the probability it holds for all v by the product of the probabilities that it holds for each v. But having established that the conditional mean is at least $\frac{m_N}{2}$ for each v, and the variance is uniformly bounded above (including in N), this gives an easy tail bound of the form we require.

Again it just remains to choose the sequence of thresholds $m_N$ to maximise the lower bound on the probability that we’ve found in this way. In both cases, it turns out that taking $m_N= \sqrt{C\log N}$ is sensible, and this turns out to be linked to the scaling of the maximum of the DGFF, which we will explore in the future.

References

[1] – J-D Deuschel, Entropic Repulsion of the Lattice Free Field, II. The 0-Boundary Case. Available at ProjectEuclid.

# DGFF 1 – The discrete Gaussian free field from scratch

I’ve moved to Haifa in northern Israel to start a post-doc in the probability group at the Technion, and now that my thesis is finished I want to start blogging again. The past couple of weeks have been occupied with finding an apartment and learning about the Discrete Gaussian Free Field. All questions about the apartment are solved, but fortunately lots remain open about the DGFF, so I thought I’d write some background about this object and methods which have been used to study it.

Background – Random walk bridge

When we think of a random walk, we usually think of the index as time, normally going forwards. So for a random walk bridge, we might assume $Z_0=0$, and then condition on $Z_N=0$, thinking of this as a demand that the process has returned to zero at the future time. In some applications, this is the ideal intuition, but in others, it is more useful to think of the random walk bridge

$(0=Z_0,Z_1,\ldots,Z_{N-1},Z_N=0),$

as a random height function indexed by [0,N], where the probability of a given path decomposes naturally into a product depending on the N increments, up to a normalising constant.

Naturally, we are interested in the asymptotic behaviour of such a random walk bridge when $N\rightarrow\infty$. So long as the step distribution has finite variance, a conditioned version of Donsker’s theorem shows that the rescaled random walk bridge converges in distribution to Brownian bridge. Note that Brownian bridge

$(B^{\mathrm{br}}_t, t\in[0,1])$

can be constructed either by conditioning a standard Brownian motion B to return to zero at time one (modulo some technicalities – this event has zero probability), or by applying an appropriate (random) linear shift

$B^{\mathrm{br}}(t):= B(t) - tB(1).$ (*)

It is not too hard to calculate the distribution of $B^{\mathrm{br}}(t)$ for each $t\in[0,1]$, and with a bit more work, one can calculate the joint distribution of $(B^{\mathrm{br}}(s),B^{\mathrm{br}}(t))$. In particular, the joint distribution is multivariate Gaussian, and so everything depends on the covariance ‘matrix’ (which here is indexed by [0,1]).

So if we return to a random walk bridge what should the step distribution be? Simple symmetric RW is a natural choice, as then lots of the quantities we might want to consider boil down to combinatorial calculations. Cleverness and Stirling’s formula can often get us useful asymptotics. But there are lots of inconveniences, not least the requirement to be careful about parity (N has to be even for a start unless you make the walk lazy, in which case the combinatorics becomes harder), and even if these can be overcome in a given calculation, it would be better not to have this.

The claim is that the random walk with Gaussian increments is by far the easiest to analyse asymptotically. As a further heuristic, think about the statement of the central limit theorem in the case where the underlying distribution is normal: it’s true but obvious. [Indeed, it’s my favourite piece of advice to anyone taking second year probability exams to check that your proposed statement of CLT does actually work for $N(\mu,\sigma^2)$…] More concretely, if a RW has Gaussian increments, then the path $(Z_1,\ldots,Z_N)$ is a multivariate normal, or a Gaussian process with finite index set. In particular, covariances define the distribution. It remains a Gaussian process after conditioning on $Z_N=0$, and the linear tilting argument at (*) remains true here, and can indeed be applied to turn any boundary conditions into any other boundary conditions.

The discrete Gaussian free field

We know how to generalise the domain of a random walk to higher dimensions. But what generalising the index to higher dimension? So now there is definitely no arrow of time, and the notion of a random height function above $\mathbb{Z}^2$ (or a subset of it) is helpful, for which a scaling limit might be a random surface rather than Brownian motion.

Because we can’t well-order $\mathbb{Z}^d$, it’s harder to define any such random object on the entire lattice immediately, so we start with compact connected subsets, with zero boundary conditions, as in the one-dimensional case of random walk bridge. Formally, let D be a finite subset of $\mathbb{Z}^d$, and the boundary $\partial D$ those elements of $D^c$ which are adjacent to an element of D, and let $\bar D:= D\cup \partial D$.

Then, the discrete Gaussian free field on D is a random real vector $h^D=(h^D_x: x\in \bar D)$, with probability density proportional to

$\mathbf{1}\{h^D_x=0, x\in\partial D\}\exp\left ( - \frac{1}{4d} \sum_{x\sim y}(h^D_x - h^D_y)^2 \right),$ (1)

where we write $x\sim y$ if that x,y are adjacent in $\bar D$. We won’t at any stage worry much about the partition function which normalises this pdf. Note also that $\frac{1}{4d}$ is just a convenient choice of constant, which corresponds to one of the canonical choices for the discrete Laplacian. Adjusting this constant is the same as uniformly rescaling the values taken by the field.

The immediate interpretation of (1) is that the values taken by the field at vertices which are close to each other are positively correlated. Furthermore, the form of the density is Gaussian. Concretely, if the values of $h^D$ are fixed everywhere except one vertex $x\in D$, then the conditional distribution of $h^D_x$ is Gaussian. Later, or in subsequent posts, we will heavily develop this idea. Alternatively, we could if we really wanted describe the model in terms of independent Gaussians describing the ‘increment’ along each edge in D (which we should direct), subject to a very large number of conditions, namely that the sum of increments along any directed cycle is zero. This latter description might be more useful if you wanted to define a DGFF on a more sparse graph, but won’t be useful in what follows.

Note that we can rearrange the Laplacian in (1) in terms of the transition kernel p( ) of the simple random walk of D to obtain

$\exp\left( -\frac12 (h^D)^T (\mathbf{P}-\mathbf{1})h^D \right),$

where $P_{x,y}=p(y-x)$ is the transition matrix of SRW on D. In particular, this means that the free field is Gaussian, and we can extract the covariances via

$\mathrm{Cov}(h^D_x,h^D_y) = \left[ (\mathbf{1}-\mathbf{P})^{-1}\right]_{x,y}$

$= \left[\sum_{n\ge 0} \mathbf{P}^n\right]_{x,y} = \sum_{n\ge 0} \mathbb{P}_x\left[X_n=y,\tau_{\partial D}>n\right],$

where, under $\mathbb{P}_x$, $(X_0,X_1,\ldots)$ is simple random walk started from x.

This final quantity records the expected number of visits to y before leaving the domain D, for a random walk started at x, and is called the Green’s function.

In summary, the DGFF on D is the centred Gaussian random vector indexed by $\bar D$ with covariance given by the Green’s function $G_D(x,y)$.

How many of these equivalences carries over to more general D-indexed random fields is discussed in the survey paper by Velenik. But it’s worth emphasising that having the covariance given by the Green’s function as in the definition we’ve just given is a very nice property, as there are lots of pre-existing tools for calculating these. By contrast, it’s hard to think of a natural model for an integer-valued surface of this kind, as an analogue to SRW.

[Though definitely not impossible. The nicest example I’ve heard of is for height functions of large uniform domino tilings within their ‘arctic circle’, which have GFF asymptotics. See this paper by Kenyon.]

A continuous limit?

We motivated the discussion of random walk bridge by the limit object, namely Brownian bridge. Part of the reason why the DGFF is more interesting than Gaussian random walk bridge, is that the limit object, the (continuum) Gaussian free field is hard to define classically in two dimensions.

We might suppose that the DGFF in $V_N$, the square box of width N has some scaling limit as $N\rightarrow\infty$. However, for fixed $x,y\in [0,1]^2$, (and taking integer parts component-wise), well-known asymptotics for SRW in a large square lattice (more on this soon hopefully) assert that

$\mathrm{Cov}(h^{V_N}_{\lfloor Nx \rfloor},h^{V_N}_{\lfloor Ny\rfloor}) \sim \log |x-y|,$ (2)

and so any scaling limit will rescale only the square domain, not the height (since there is no N on the RHS of (2)). However, then the variance of the proposed limit is infinite everywhere.

So the GFF does not exist as a random height function on $[0,1]^2$, with the consequence that a) more care is needed over its abstract definition; b) the DGFF in 2D on a large square is an interesting object, since it does exist in this sense.

What makes it ‘free’?

This seemed like a natural question to ask, but I’ve received various answers. Some sources seem to suggest that having zero boundary condition is free. Other sources refer to the Hamiltonian (that is the term inside the exponential function at (1) ) as free since it depends only on the increments between values. If the Hamiltonian also depends on the heights themselves, for example via the addition of a $\sum_{x} \Psi(h^D_x)$ term, then for suitable choice of function $\Psi$, this is interpreted as a model where the particles have mass. The physical interpretation of these more general Gibbs measures is discussed widely, and I’m not very comfortable with it all at the moment, but aim to come back to it later, when hopefully I will be more comfortable.

# SLE Revision 4: The Gaussian Free Field and SLE4

I couldn’t resist breaking the order of my revision notes in order that the title might be self-referential. Anyway, it’s the night before my exam on Conformal Invariance and Randomness, and I’m practising writing this in case of an essay question about the Gaussian Free Field and its relation to the SLE objects discussed in the course.

What is a Gaussian Free Field?

The most natural definition is too technical for this context. Instead, recall that we could very informally consider a Poisson random measure to have the form of a series of Poisson random variables placed at each point in the domain, weighted infinitissimely so that the integrals over an area give a Poisson random variable with mean proportional to the measure of the area, and so that different areas are independent. Here we do a similar thing only for infinitesimal centred Gaussians. We have to specify the covariance structure.

We define the Green’s function on a domain D, which has a resonance with PDE theory, by:

$G_D(x,y)=\lim_{\epsilon\rightarrow 0}\mathbb{E}[\text{time spent in }B(y,\epsilon)\text{ by BM started at }x\text{ stopped at }T_D]$

We want the covariance structure of the hypothetical infinitesimal Gaussians to be given by $\mathbb{E}(g(x)g(y))=G_D(x,y)$. So formally, we define $(\Gamma(A),A\subset D)$ for open A, by $(\Gamma(A_1),\ldots,\Gamma(A_n))$ a centred Gaussian RV with covariance $\mathbb{E}(\Gamma(A_1)\Gamma(A_2))=\int_{A_1\times A_2}dxdyG_D(x,y)$.

The good news is that we have a nice expression $G_U(0,x)=\log\frac{1}{|x|}$, and the Green’s functions are conformally invariant in the sense that $G_{\phi(D)}(\phi(x),\phi(y))=G_D(x,y)$, following directly for conformality of Brownian Motion.

The bad news is that the existence is not clear. The motivation for this is the following though. We have a so-called excursion measure for BMs in a domain D. There isn’t time to discuss this now: it is infinite, and invariant under translations of the boundary (assuming the boundary is $\mathbb{R}\subset \bar{\mathbb{H}}$, which is fine after taking a conformal map). Then take a Poisson Point Process on the set of Brownian excursions with this measure. Now define a function f on the boundary of the domain dD, and define $\Gamma_f(A)$ to be the sum of the values of f at the starting point of BMs in the PPP passing through A, weighted by the time spent in A. We have a universality relation given by the central limit theorem: if we define h to be (in a point limit) the expected value of this variable, and we take n independent copies, we have:

$\frac{1}{\sqrt{n}}\left['\Gamma_f^1(A)+\ldots+\Gamma_f^n(A)-n\int_Ah(x)dx\right]\rightarrow\Gamma(A)$

where this limiting random variable is Gaussian.

For now though, we assume existence without full proof.

SLE_4

We consider chordal SLE_k, which has the form of a curve $\gamma[0,\infty]$ from 0 to $\infty$ in H. The g_t the regularising function as normal, consider $\tilde{X}_t=X_t-W_t:=g_t(x)-\sqrt{\kappa}\beta_t$ for some fixed x. We are interested in the  evolution of the function arg x. Note that conditional on the (almost sure for K<=4) event that x does not lie on the curve, arg x will converge either to 0 or pi almost surely, depending on whether the curve passes to the left or the right (respectively) of x.

By Loewner’s DE for the upper half-plane and Ito’s formula:

$d\bar{X}_t=\sqrt{\kappa}d\beta_t,\quad d\log\bar{X}_t=(2-\frac{\kappa}{2})\frac{dt}{\bar{X}_t^2}+\frac{\sqrt{\kappa}}{\bar{X}_t}d\beta_t$

So, when K=4, the dt terms vanish, which gives that log X is a local martingale, and so

$d\theta_t=\Im(\frac{2}{\bar{X}_t}d\beta_t$

is a true martingale since it is bounded. Note that

$\theta_t=\mathbb{E}[\pi1(x\text{ on right of }\gamma)|\mathcal{F}_t]$

Note that also:

$\mathbb{P}(\text{BM started at }x\text{ hits }\gamma[0,t]\cup\mathbb{R}\text{ to the right of }\gamma(t)|\gamma[0,t])=\frac{\theta_t}{\pi}$ also.

SLE_4 and the Gaussian Free Field on H

We will show that this chordal SLE_4 induces a conformal Markov type of property in Gaussian Free Fields constructed on the slit-domain. Precisely, we will show that if $\Gamma_T$ is a GFF on $H_T=\mathbb{H}\backslash\gamma[0,T]$, then $\Gamma_T+ch_T(\cdot)=\Gamma_0+ch_0(\cdot)$, where c is a constant to be determined, and $h_t(x)=\theta_t(x)$ in keeping with the lecturer’s notation!

It will suffice to check that for all fixed p with compact support $\Gamma_T(p)+c(h_T(p)-h_0(p))$ is a centred Gaussian with variance $\int dxdyG_H(x,y)p(x)p(y)$.

First, applying Ito and conformal invariance of the Green’s functions under the maps g_t,

$dG_{H_t}(x,y)=cd[h(x),h(y)]_t$

The details are not particularly illuminating, but exploit the fact that Green’s function on H has a reasonable nice form $\log\left|\frac{x-\bar{y}}{x-y}\right|$. We are also being extremely lax with constants, but we have plenty of freedom there.

After applying Ito and some (for now unjustified) Fubini:

$dh_t(p)=\left(\int c.p(x)\Im(\frac{1}{\bar{X}_t})dx\right)d\beta_t$

and so as we would have suspected (since h(x) was), this is a local martingale. We now deploy Dubins-Schwarz:

$h_T(p)-h_T(0)\stackrel{d}{=}B_{\sigma(T)}$ for B an independent BM and

$\sigma(T)=\int_0^Tdt(\int c.p(x)\Im(\frac{1}{\tilde{X}_t})dx)^2$

So conditional on $(h_T(p),t\in[0,T])$, we want to make up the difference to $\Gamma_0$. Add to $h_T(p)-h_0(p)$ an independent random variable distribution as $N(0,s-\sigma(T))$, where

$s=\int dxdyp(x)p(y)G(x,y)\quad =:\Gamma_0(p)$

Then

$s-\sigma(T)=\int p(x)p(y)[G(x,y)-c\int_0^Tdt\Im(\frac{1}{X_t})\Im(\frac{1}{Y_t})]dxdy=\int p(x)p(y)G_t(x,y)dxdy$ as desired.

Why is this important?

This is important, or at least interesting, because we can use it to reverse engineer the SLE. Informally, we let $T\rightarrow\infty$ in the previous result. This states that taking a GFF in the domain left by removing the whole of the SLE curve (whatever that means) then adding $\pi$ at points on the left of the curve, which is the limit $\lim_T h_T$ is the same as a normal GFF on the upper half plane added to the argument function. It is reasonable to conjecture that a GFF in a non-connected domain has the same structure as taking independent GFFs in each component, and this gives an interesting invariance condition on GFFs. It can also be observed (Schramm-Sheffield) that SLE_4 arises by reversing the argument – take an appropriate conditioned GFF on H and look for the interface between it being ‘large’ and ‘small’ (Obviously this is a ludicrous simplification). This interface is then, under a suitable limit, SLE_4.