This post continues the theme of revising topics in the analytic toolkit relevant to proving convergence of stochastic processes. Of particular interest is the question of how to prove that a family of Markov chains has a scaling limit, as processes, given by a solution of some stochastic differential equation, in a generalisation of Donsker’s theorem for Brownian motion. In this post, however, we address more general aspects of convergence of stochastic processes, with particular reference to Skorohod space.

**Topological Background**

I’ve discussed Skorohod space in a previous post. For now, we focus attention on cadlag functions on a compact time interval, D[0,T]. Some of what follows can be extended to the infinite-time setting easily, and some requires more work. Although we can define a metric on the space of cadlag functions in lots of ways, it is more useful to think topologically, or at least with a more vague sense of metric. We say two cadlag functions are close to one another if there is a reparameterisation of the time-axis (an increasing bijection from [0,T] to itself) that is uniformly close to the identity function and that, when applied to one of the cadlag functions, brings it uniformly close to the other. Heuristically, two cadlag functions are close if their large jumps occur at nearby times and are of similar size, and if they are uniformly close elsewhere. It is worth remembering that a cadlag function on even an unbounded interval can have only countably many jumps, and only finitely many with magnitude greater than some threshold on any compact interval.
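To make the reparameterisation idea concrete, here is a minimal Python sketch comparing two indicator functions whose jumps occur at nearby times. The names `step` and `time_change`, the jump times 0.5 and 0.51, and the evaluation grid are illustrative assumptions, and the suprema are only approximated on that grid.

```python
T = 1.0

def step(x):
    """Cadlag indicator: f(t) = 1 for t >= x, else 0."""
    return lambda t: 1.0 if t >= x else 0.0

def time_change(x, y):
    """Piecewise-linear increasing bijection lam of [0, T] with lam(x) = y."""
    def lam(t):
        if t <= x:
            return t * y / x
        return y + (t - x) * (T - y) / (T - x)
    return lam

# Hypothetical jump times, chosen close together
x, y = 0.5, 0.51
f, g, lam = step(x), step(y), time_change(x, y)

# Evaluation grid on [0, T]; the jump time x is included explicitly
grid = sorted({k / 1000 for k in range(1001)} | {x})

# Uniform distance: the two jumps never line up
uniform_dist = max(abs(f(t) - g(t)) for t in grid)
# How far the time change is from the identity
time_distortion = max(abs(lam(t) - t) for t in grid)
# Uniform distance after reparameterising g
space_distortion = max(abs(f(t) - g(lam(t))) for t in grid)

print(uniform_dist, time_distortion, space_distortion)
```

In this example the uniform distance is 1 however close the jump times are, whereas after a time change that moves the time axis by only about 0.01 the two functions agree on the whole grid.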

For much of the theory one would like to use, it is useful for the spaces under investigation to be separable. Recall a topological space is *separable* if there exists a countable dense subset. Note in particular that D[0,T] is not separable under the uniform metric, since we can define $f_x = \mathbf{1}_{[x,T]}$ for each $x \in [0,T]$, then $\|f_x - f_y\|_\infty = 1$ whenever $x \neq y$. In particular, we have an uncountable collection of disjoint open sets given by the balls $B(f_x, 1/2)$, and so the space is not separable, since a dense subset would have to contain a point of each ball. Similarly, $C[0,\infty)$ is not separable under the uniform metric. A counterexample might be given by considering continuous functions which take values in {0,1} on the integers, say by linear interpolation. Thus we have a map from $\{0,1\}^{\mathbb{N}}$ to $C[0,\infty)$, where the uniform distance between any two distinct image points is at least one, hence the open balls of radius 1/2 around each image point give the same contradiction as before. However, the Stone-Weierstrass theorem shows that C[0,T] *is* separable, as we can approximate any such function uniformly well by a polynomial, and thus uniformly well by a polynomial with rational coefficients.

In any case, it can be shown that D[0,T] is separable with respect to the natural choice of metric. It can also be shown that there is a metric which gives the same open sets (hence is a *topologically equivalent* metric) under which D[0,T] is complete, and hence a Polish space.

**Compactness in C[0,T] and D[0,T]**

We are interested in tightness of measures on D[0,T], so first we need to address compactness for sets of deterministic functions in D[0,T]. First, we consider C[0,T]. Here, the conditions for a set of functions to be compact are given by the celebrated Arzelà-Ascoli theorem. We are really interested in compactness as a property of size, so we consider instead *relative compactness*. A set is relatively compact (sometimes *pre-compact*) if its closure is compact. For the existence of subsequential limits, this is identical to compactness, only now we allow the possibility of the limit point lying outside the set.

We note that the function $C[0,T] \to \mathbb{R}$ given by $f \mapsto \|f\|_\infty$ is continuous, and hence *uniform boundedness* is certainly a required condition for compactness in C[0,T]. Arzelà-Ascoli states that uniform boundedness plus equicontinuity is sufficient for a set of such functions to be relatively compact. *Equicontinuity* should be thought of as uniform continuity that is uniform among all the functions in the set, rather than just within the argument of an individual function.

For identical reasons, we need uniform boundedness for relative compactness in D[0,T], but obviously uniform continuity won’t work as a criterion for discontinuous functions! We seek some analogue of the modulus of continuity that ignores jumps. We define

$$w'_f(\delta) := \inf_{\{t_i\}} \max_{1 \le i \le k} \sup_{s,t \in [t_{i-1}, t_i)} |f(s) - f(t)|,$$

where the infimum is taken over all meshes $0 = t_0 < t_1 < \ldots < t_k = T$ with $t_i - t_{i-1} > \delta$ for every $i$. Note that as $\delta \downarrow 0$, we can, if we want, place the $t_i$ so that large jumps of the function f take place over the boundaries between adjacent parts of the mesh. In particular, for a given cadlag function, it can be shown fairly easily that $w'_f(\delta) \downarrow 0$ as $\delta \downarrow 0$. Then, unsurprisingly, in a similar fashion to the Arzelà-Ascoli theorem, it follows that a set of functions $A \subset D[0,T]$ is relatively compact if it is uniformly bounded, and

$$\lim_{\delta \downarrow 0} \sup_{f \in A} w'_f(\delta) = 0.$$

Note that this ‘modulus of continuity’ needs to decay uniformly across the set of functions, but that we do not need to choose the mesh at level $\delta$ uniformly across all functions. This would obviously not work, as then the functions $f_n = \mathbf{1}_{[x_n, T]}$ for any sequence of distinct points $x_n \to x \in (0,T)$ would not be relatively compact, but they clearly converge in Skorohod space!
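The role of the mesh can be checked numerically for a single indicator function. The sketch below is only a grid approximation under stated assumptions: the helper names (`oscillation`, `mesh_oscillation`, `standard_modulus`), the jump time 0.5, and the two meshes are all hypothetical choices. It evaluates the oscillation within the cells of one fixed mesh, which gives an upper bound on the infimum over meshes rather than the infimum itself.

```python
def step(x):
    """Cadlag indicator: f(t) = 1 for t >= x, else 0."""
    return lambda t: 1.0 if t >= x else 0.0

def oscillation(f, a, b, n=1000):
    """Grid approximation to sup |f(s) - f(t)| over s, t in [a, b)."""
    values = [f(a + (b - a) * k / n) for k in range(n)]
    return max(values) - min(values)

def mesh_oscillation(f, mesh):
    """Largest oscillation of f over the cells [t_{i-1}, t_i) of the mesh."""
    return max(oscillation(f, a, b) for a, b in zip(mesh, mesh[1:]))

def standard_modulus(f, delta, T=1.0, n=1000):
    """Grid approximation to sup |f(s) - f(t)| over |s - t| <= delta."""
    window = max(1, int(delta * n / T))
    values = [f(T * k / n) for k in range(n + 1)]
    return max(abs(values[j] - values[i])
               for i in range(n + 1)
               for j in range(i + 1, min(i + window, n) + 1))

f = step(0.5)

# Mesh with a boundary exactly at the jump: the jump is invisible
aligned = mesh_oscillation(f, [0.0, 0.5, 1.0])
# Mesh that traps the jump inside a cell: the jump is seen in full
misaligned = mesh_oscillation(f, [0.0, 0.4, 1.0])
# The ordinary modulus of continuity always sees the jump
w_standard = standard_modulus(f, delta=0.01)

print(aligned, misaligned, w_standard)
```

Both cells of the aligned mesh have length 0.5, so it is admissible for any level below 0.5 and the jump-ignoring modulus of this function vanishes there, while the ordinary modulus stays at 1. A mesh aligned to one jump time is useless for an indicator jumping elsewhere, which is why the mesh has to be chosen separately for each function.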

**Tightness in C[0,T] and D[0,T]**

Naturally, we are mainly interested in (probability) measures on D[0,T], and in particular conditions for tightness on this space. Recall a family of measures $(\mu_n)$ is *tight* if for any $\epsilon > 0$, there exists some compact set A such that $\mu_n(A) \ge 1 - \epsilon$ for every $n$.

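So, for measures $(\mu_n)$ on D[0,T], the sequence is tight precisely if for any $\epsilon > 0$, there exist $M, \delta > 0$ and some N such that for any n>N, both

$$\mu_n\left( \|f\|_\infty \ge M \right) \le \epsilon, \qquad \mu_n\left( w'_f(\delta) \ge \epsilon \right) \le \epsilon$$

hold, where $w'_f$ is the jump-ignoring modulus defined above. In fact, the second condition controls variation sufficiently strongly that we can replace the first condition with a pointwise version: for each $t$ in some dense subset of [0,T] containing $T$, we require $\mu_n(|f(t)| \ge M) \le \epsilon$ for all n>N, where now M and N may depend on $t$.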

Often we might be taking some sort of scaling limit of these processes in D[0,T], where the jumps become so small in the limit that we expect the limit process to be continuous, perhaps the solution of an SDE or some other diffusion. If we can replace the jump-ignoring modulus $w'_f(\delta)$ by $w_f(\delta) = \sup_{|s-t| \le \delta} |f(s) - f(t)|$, the standard modulus of continuity, then we have the additional conclusion that any weak limit lies in C[0,T].
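As an illustration of jumps vanishing in a scaling limit, the following sketch simulates the rescaled simple random walk from Donsker’s theorem and reports its largest jump. The function names `rescaled_walk` and `largest_jump`, the fixed seed, and the chosen values of n are assumptions made for this example only.

```python
import math
import random

def rescaled_walk(n, T=1.0, seed=0):
    """Values at times k/n of X^n_t = S_{floor(nt)} / sqrt(n), for a simple random walk S."""
    rng = random.Random(seed)
    position = 0
    path = [0.0]
    for _ in range(int(n * T)):
        position += rng.choice((-1, 1))  # one +-1 step of the walk
        path.append(position / math.sqrt(n))
    return path

def largest_jump(path):
    """Largest jump of the piecewise-constant cadlag path through these values."""
    return max(abs(b - a) for a, b in zip(path, path[1:]))

for n in (10, 100, 1000, 10000):
    print(n, largest_jump(rescaled_walk(n)))
```

Every jump of the rescaled walk has size exactly $n^{-1/2}$, whatever the realisation, so the jumps vanish uniformly as n grows. This is the kind of situation in which one hopes to control the standard modulus of continuity directly and so conclude that any weak limit is continuous.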

In general, to prove convergence of some stochastic processes, we will want to show that the processes are tight, by demonstrating the properties above, or something equivalent. Then Prohorov’s theorem (which I tend to think of as a probabilistic functional version of Bolzano-Weierstrass) asserts that the family of processes has a weak subsequential limit, and indeed that every subsequence has a further subsequence which converges weakly. Typically, one then shows that any weak subsequential limit must have the law of some particular random process. Normally this is achieved by showing some martingale property (e.g. for an SDE) in the limit, often by using the Skorohod representation theorem to work with almost sure subsequential convergence rather than merely weak convergence. Then one argues that there is a unique process with this property and a given initial distribution. So since all weak subsequential limits have the law of this given process, in fact the whole family converges weakly to it.