This is something I had to think about when writing my Part III essay, and it turns out to be relevant to some of the literature I’ve been reading this week. The main result is hugely helpful for reducing a potentially complicated combinatorial object to a finite sum of i.i.d. random variables, which in general we do know quite a lot about. I was very pleased with the proof I came up with while writing the essay, even if in the end it turned out to have appeared elsewhere before. (Citation at end)

**Galton-Watson processes**

A *Galton-Watson process* is a stochastic process describing a simple model for evolution of a population. At each stage of the evolution, a new generation is created as every member of the current generation produces some number of `offspring’ with identical and independent (both across all generations and within generations) distributions. Such processes were introduced by Galton and Watson to examine the evolution of surnames through history.

More precisely, we specify an *offspring distribution*, a probability distribution supported on . Then define a sequence of random variables by:

where is a family of i.i.d. random variables with the offspring distribution . We say is the size of the th generation. From now on, assume and then we call a Galton-Watson process. We also define the *total population size* to be

noting that this might be infinite. We refer to the situation where finite as *extinction*, and can show that extinction occurs almost surely when , excepting the trivial case . The strict inequality parts are as you would expect. We say the process is *critical *if , and this is less obvious to visualise, but works equally well in the proof, which is usually driven using generating functions.

**Total Population Size and Dwass’s Theorem**

Of particular interest is , the total population size, and its distribution. The following result gives us a precise and useful result linking the probability of the population having size and the distribution of the sum of RVs with the relevant offspring distribution. Among the consequences are that we can conclude immediately, by CLT and Cramer’s Large Deviations Theorem, that the total population size distribution has power-law decay in the critical case, and exponential decay otherwise.

**Theorem (Dwass (1)): **For a general branching process with a single time-0 ancestor and offspring distribution $Y$ and total population size $X$:

where are independent copies of .

We now give a proof via a combinatorial argument. The approach is similar to that given in (2). Much of the literature gives a proof using generating functions.

**Proof: **For motivation, consider the following. It is natural to consider a branching process as a tree, with the time-0 ancestor as the root. Suppose the event in holds, which means that the tree has vertices. Now consider the numbers of offspring of each vertex in the tree. Since every vertex except the root has exactly one parent, and there are no vertices outside the tree, we must have where are the offspring numbers in some order. However, observe that this is not sufficient. For example, if is the number of offspring of the root, and , then we must have . Continue reading