# Coalescence 1: What is it, and why do we care?

As part of Part III, I am writing an essay instead of sitting an extra exam paper, and I have chosen the topic of ‘Multiplicative Coalescence’. I want to avoid contravening the plagiarism rules, which do not allow you to quote your own words without a proper citation (tricky on a blog, I figure), nor to publish openly anything you intend to submit. So, just to be absolutely sure, I’m going to suppress this series of posts until after May 4th, when everything has to be handed in.

———–

Informal Description

Coalescence refers to a process in which particles join together over time. An example might be islands of foam on the surface of a cup of coffee. When two clumps meet, they join, and will never split. In this example, a model would need to take into account the shape of all the islands, their positions, their velocities, and boundary properties. To make things tractable, we need to distance ourselves from the idea that particles merge through collisions, which are highly physical and complicated, and instead just consider that they merge.

Description of the Model

When two particles coalesce, it is natural to assume that mass is conserved, as this will be necessary in any physical application. With this in mind, it makes sense to set up the entire model using only the masses of particles. Define the kernel K(x,y) which describes the relative rate or likelihood of the coalescence {x,y} -> x+y. This has a different precise meaning in different contexts. Effectively, we are making a mean-field assumption that all the complications of a physical model as described above can be absorbed into this coalescent kernel, either because the number of particles is large, or because the other effects are small.

When there is, initially, a finite number of particles, the process is stochastic. Coalescences almost surely happen one at a time, and so we can view the process as a continuous-time Markov chain whose state space is the set of relevant partitions of the total mass present. The transition rate p(A,B) is given by K(x,y) when the coalescence {x,y} -> x+y transforms partition A into partition B, and 0 otherwise. An observation is that the process recording the number of {x,y} -> x+y coalescences is an inhomogeneous Poisson process with local intensity n(x,t)n(y,t)K(x,y), where n(x,t) is the number of particles with mass x at time t.
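The finite process described above can be simulated directly. The following is a minimal sketch, assuming that each unordered pair of particles with masses x and y merges at rate K(x,y) with no further normalisation (the precise meaning of the kernel varies between contexts). The names `simulate_coalescent` and `multiplicative_kernel` are made up for this illustration.

```python
import random


def multiplicative_kernel(x, y):
    # K(x, y) = xy, the kernel this series of posts focuses on.
    return x * y


def simulate_coalescent(masses, kernel, rng):
    """Run a finite coalescent until a single particle remains.

    Assumption of this sketch: each unordered pair of particles with masses
    x and y merges at rate kernel(x, y), with no volume normalisation.
    Returns a list of (time, sorted masses) pairs, one per state visited.
    """
    masses = list(masses)
    t = 0.0
    history = [(t, sorted(masses))]
    while len(masses) > 1:
        pairs = [(i, j) for i in range(len(masses)) for j in range(i + 1, len(masses))]
        rates = [kernel(masses[i], masses[j]) for i, j in pairs]
        total_rate = sum(rates)
        if total_rate <= 0:
            break  # no coalescence can occur from this state
        # Holding time in the current state is exponential with the total rate.
        t += rng.expovariate(total_rate)
        # Choose which pair merges, with probability proportional to its rate.
        i, j = rng.choices(pairs, weights=rates, k=1)[0]
        merged = masses[i] + masses[j]
        masses.pop(j)  # j > i, so remove j first to keep index i valid
        masses.pop(i)
        masses.append(merged)
        history.append((t, sorted(masses)))
    return history


history = simulate_coalescent([1] * 10, multiplicative_kernel, random.Random(0))
for time, state in history:
    print(f"t = {time:.3f}  masses = {state}")
```

Each step merges exactly one pair, so the total mass is unchanged and the number of particles drops by one per transition, as in the Markov chain description.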

This motivates the move to an infinite-volume setting. Suppose that there are infinitely many particles, so that coalescences are occurring continuously. The rate of {x,y} -> x+y coalescences is still n(x,t)n(y,t)K(x,y), but now n(x,t) specifies the density of particles with mass x at time t. Furthermore, because of the continuum framework, this rate is now deterministic rather than stochastic. This is extremely important: once the randomness has been removed, the model can be treated as a large but simple system of ODEs.

Two Remarks

1) Once this introduction is finished, we shall be bringing our focus onto multiplicative coalescence, where K(x,y) = xy. In particular, this is a homogeneous function, as are the other canonical kernels. This means that considering K(x,y) = cxy is the same as performing a constant factor time-change when K(x,y) = xy. Similarly, it is not important how the density n(x,t) is scaled as this can also be absorbed with a time-change. In some contexts, it will be natural and useful to demand that the total density be 1, but this will not always be possible. In general it is convenient to absorb as much as possible into the time parameter, particularly initial conditions, as will be discussed.

2) Working with an infinite volume of particles means that mass is no longer constrained to finitely many values. Generally, it is assumed that the masses are either discrete, taking values in the positive integers, or continuous, taking values in the positive reals. In the continuous case, the rate of coalescences between particles with masses in (x, x+dx) and (y,y+dy) is n(x,t)n(y,t)K(x,y)dxdy. The main difference between the two settings will arise when we try to view the process as a limit of finite processes.

Smoluchowski’s Equations

A moment’s thought allows us to describe this deterministic process by a family of ODEs. Informally, the density of particles with mass x is controlled by the rate of coalescences in which mass x particles merge with others, and the rate of coalescences which form such particles. For reasons relating to the choice of definition of K(x,y) and for convenience when considering duality (see later), the convention is that the rate of coalescences of the form {x,y} -> x+y is: $\frac12 n(x,t)n(y,t)K(x,y)$.

Very informally, the factor of $\frac12$ arises because {x,y} -> x+y and {y,x} -> x+y describe the same coalescence, which would otherwise be counted twice. Smoluchowski’s equations give an ODE description of this process.

Discrete setting: $\frac{d}{dt}n(x,t)=\frac12 \sum_{y=1}^{x-1}K(y,x-y)n(y,t)n(x-y,t)-n(x,t)\sum_{y=1}^\infty K(x,y)n(y,t)$

Continuous setting: $\frac{d}{dt}n(x,t)=\frac12 \int_0^x K(y,x-y)n(y,t)n(x-y,t)dy-n(x,t)\int_0^\infty K(x,y)n(y,t)dy$
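The earlier remark about time-changes can be checked directly against these equations. This is only a sketch in the discrete setting: $c$ and $\lambda$ stand for arbitrary positive constants, and $m$ is notation introduced just for this check. Suppose $n(x,t)$ solves the discrete equation with kernel $K$, and set $m(x,t)=n(x,ct)$. By the chain rule, the right-hand side picks up a factor of $c$, which can be moved into the kernel:

$\frac{d}{dt}m(x,t)=\frac12 \sum_{y=1}^{x-1}cK(y,x-y)m(y,t)m(x-y,t)-m(x,t)\sum_{y=1}^\infty cK(x,y)m(y,t)$

So $m$ solves the equation with kernel $cK$. Similarly, set $m(x,t)=\lambda n(x,\lambda t)$. The chain rule now gives a factor of $\lambda^2$, and because every term on the right-hand side is quadratic in the density, this is exactly what is needed to turn each $n$ into an $m$:

$\frac{d}{dt}m(x,t)=\frac12 \sum_{y=1}^{x-1}K(y,x-y)m(y,t)m(x-y,t)-m(x,t)\sum_{y=1}^\infty K(x,y)m(y,t)$

So rescaling the density by $\lambda$ corresponds to running the same equation, with the same kernel, at speed $\lambda$.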

The most important consequence is that it is possible to solve these equations for many of the interesting kernels. The example of the multiplicative coalescent is particularly tractable. Properties of the analytic solutions will often carry over into the probabilistic models which approximate them.
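As a rough numerical illustration, the discrete equations can be integrated for $K(x,y)=xy$ starting from unit density of mass 1 particles. With this kernel, and for as long as the total mass $\sum_y y\,n(y,t)$ stays equal to 1, the $x=1$ equation reduces to $\frac{d}{dt}n(1,t)=-n(1,t)$, so $n(1,t)=e^{-t}$, which gives something to compare against. The sketch below is not a recommended solver: truncating at a largest mass `N`, the choice of a classical Runge-Kutta scheme, and the function names are all assumptions made for the example.

```python
import math


def smoluchowski_rhs(n, kernel):
    """Right-hand side of the discrete equations, truncated at mass N.

    n[x] is the density of mass-x particles for x = 1..N; n[0] is unused.
    Truncating the loss sum at N is an approximation made by this sketch.
    """
    N = len(n) - 1
    dn = [0.0] * (N + 1)
    for x in range(1, N + 1):
        gain = 0.5 * sum(kernel(y, x - y) * n[y] * n[x - y] for y in range(1, x))
        loss = n[x] * sum(kernel(x, y) * n[y] for y in range(1, N + 1))
        dn[x] = gain - loss
    return dn


def rk4_solve(n0, kernel, t_end, steps):
    """Integrate from time 0 to t_end with the classical Runge-Kutta scheme."""
    n = list(n0)
    h = t_end / steps
    for _ in range(steps):
        k1 = smoluchowski_rhs(n, kernel)
        k2 = smoluchowski_rhs([a + 0.5 * h * b for a, b in zip(n, k1)], kernel)
        k3 = smoluchowski_rhs([a + 0.5 * h * b for a, b in zip(n, k2)], kernel)
        k4 = smoluchowski_rhs([a + h * b for a, b in zip(n, k3)], kernel)
        n = [a + h * (b + 2 * c + 2 * d + e) / 6
             for a, b, c, d, e in zip(n, k1, k2, k3, k4)]
    return n


N = 30
n0 = [0.0] * (N + 1)
n0[1] = 1.0  # start with unit density of mass-1 particles only
n_half = rk4_solve(n0, lambda x, y: x * y, t_end=0.5, steps=100)
print("n(1, 0.5) numerically:", n_half[1])
print("exp(-0.5):            ", math.exp(-0.5))
```

The two printed values should agree closely, since at time 0.5 very little mass has reached sizes beyond the truncation level.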

Applications

In conclusion, we have a family of discrete stochastic processes and a readily-soluble continuum process. The exact nature of the limiting relationship needs to be further explored, but there will certainly be significant connections.

Some interesting physical systems can be modelled directly as coalescent processes. Applications to chemistry include aerosols, where liquid particles coalesce while suspended in some medium, and polymerisation. The scale is not important: galaxies and algae both behave in this way.

There are also applications to be found by viewing existing probabilistic models in a coalescent context. Models of population genetics, where clusters represent common ancestors, have shown value. Another important application has been in the field of random graph processes. The most striking feature of the Erdős-Rényi process is the phase transition in the size of the largest component. The process recording the component sizes is very closely related to the multiplicative coalescent. Very informally, given two components with sizes x and y, there are xy potential edges joining them, so the probability of such an edge being added is proportional to xy, once everything has been defined satisfactorily. As will be seen later, this equivalence will be useful in defining a canonical version of the multiplicative coalescent.
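The link with random graphs can be explored with a small experiment. The sketch below assumes the simplest version of the graph process, in which distinct edges are added uniformly at random, and tracks the component sizes with a union-find structure. Whenever a new edge joins two different components of sizes x and y, they merge into one of size x+y, and a uniformly chosen edge lands between a given pair of components with probability proportional to xy. The function name `component_sizes_after_edges` is made up for this illustration.

```python
import random
from itertools import combinations


def component_sizes_after_edges(n_vertices, n_edges, rng):
    """Add n_edges distinct uniformly random edges; return component sizes.

    Sizes are returned in decreasing order. Components are tracked with a
    union-find structure (path halving plus union by size).
    """
    parent = list(range(n_vertices))
    size = [1] * n_vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    all_edges = list(combinations(range(n_vertices), 2))
    for u, v in rng.sample(all_edges, n_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            # Two components of sizes x and y coalesce into one of size x + y.
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    roots = [r for r in range(n_vertices) if parent[r] == r]
    return sorted((size[r] for r in roots), reverse=True)


rng = random.Random(0)
for n_edges in (50, 100, 150):
    sizes = component_sizes_after_edges(200, n_edges, rng)
    print(f"{n_edges} edges on 200 vertices: largest component has size {sizes[0]}")
```

Edges that fall inside an existing component change nothing in the list of sizes, which is one reason the correspondence with the multiplicative coalescent needs some care to state precisely.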