One of the questions I posed at the end of the previous post about the Central Limit Theorem was this: what is special about the normal distribution?
More precisely, for a large class of variables (those with finite variance) the limit in distribution of the partial sums $S_n = X_1 + \ldots + X_n$, after a natural rescaling, is N(0,1). As a starting point for investigating similar results for a more general class of underlying distributions, it is worth considering what properties we might require of a distribution if it is to appear as a limit in distribution of sums of IID RVs, rescaled if necessary.
The property required is that the distribution is stable. In the rest of the post I am going to give an informal precis of the content of the relevant chapter of Feller.
Throughout, we assume a collection of IID RVs, $X, X_1, X_2, \ldots$, with the partial sums $S_n := X_1 + \ldots + X_n$. Then we say $X$ is stable in the broad sense if
$$S_n \stackrel{d}{=} c_n X + \gamma_n$$
for some deterministic parameters $c_n > 0$ and $\gamma_n$, for every n. If in fact $\gamma_n = 0$ for every n, then we say $X$ is stable in the strict sense. I’m not sure if this division into strict and broad is still widely drawn, but anyway. One interpretation might be that a collection of distributions is stable if they form a non-trivial subspace of the vector space of random variables and also form a subgroup under the operation of adding independent RVs. I’m not sure that this is hugely useful either though. One observation is that if $\mathbb{E}X$ exists and is 0, then so are all the $\gamma_n$s.
The key result to be shown is that
$$c_n = n^{1/\alpha}$$
for some $0 < \alpha \leq 2$.
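A quick simulation can make this scaling concrete. The sketch below is only an illustration, not part of the proof: it assumes the two textbook examples, the standard normal ($\alpha = 2$) and the standard Cauchy ($\alpha = 1$), and compares the quartiles of $S_n / n^{1/\alpha}$ with those of a single copy of $X$. The helper names (`rescaled_sum_quartiles`, `draw_normal`, `draw_cauchy`) and the sample sizes are arbitrary choices made for this example.

```python
import math
import random
import statistics


def rescaled_sum_quartiles(draw, alpha, n, reps, rng):
    """Quartiles of S_n / n**(1/alpha), where S_n sums n IID draws."""
    scale = n ** (1.0 / alpha)
    sums = [sum(draw(rng) for _ in range(n)) / scale for _ in range(reps)]
    # statistics.quantiles with n=4 returns the three quartile cut points.
    return statistics.quantiles(sums, n=4)


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng):
    # Inverse-CDF sampling of the standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))


rng = random.Random(0)  # fixed seed so the run is reproducible
q_normal = rescaled_sum_quartiles(draw_normal, 2.0, 20, 20000, rng)
q_cauchy = rescaled_sum_quartiles(draw_cauchy, 1.0, 20, 20000, rng)

print(q_normal)  # compare with the N(0,1) quartiles, about -0.674, 0, 0.674
print(q_cauchy)  # compare with the standard Cauchy quartiles -1, 0, 1
```

If the Cauchy sums were instead divided by $\sqrt{n}$, the quartiles would spread out as n grows, which is one way to see that the exponent $1/\alpha$ really does depend on the distribution.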
Relevant though the observation about means is, a more useful one is this. The stability property is retained if we replace the distribution of $X$ with the distribution of $X_1 - X_2$ (independent copies naturally!). The behaviour of $c_n$ is also preserved. Now we can work with an underlying distribution that is symmetric about 0, rather than merely centred. The deduction that $\gamma_n = 0$ still holds now, whether or not X has a mean.
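Now we proceed with the proof. All equalities are taken to be in distribution unless otherwise specified. By splitting $S_{m+n}$ into two smaller sums, we deduce that
$$c_{m+n} X = S_{m+n} = c_m X_1 + c_n X_2.$$
Extending this idea by splitting $S_{mn}$ into $m$ blocks of $n$ terms each, we have
$$c_{mn} = c_m c_n. \qquad (*)$$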
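As a sanity check, here is the standard normal case worked out. The only input is the familiar fact that a sum of independent centred normals is normal with the variances added. Taking $X \sim N(0,1)$, the splitting into two smaller sums reads
$$c_m X_1 + c_n X_2 \sim N(0,\, c_m^2 + c_n^2), \qquad c_{m+n} X \sim N(0,\, c_{m+n}^2),$$
so that $c_{m+n}^2 = c_m^2 + c_n^2$. With $c_1 = 1$ this forces
$$c_n = \sqrt{n} = n^{1/2}, \qquad c_{mn} = \sqrt{mn} = c_m c_n,$$
which is the multiplicative relation for block sums, with exponent $1/\alpha = 1/2$.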
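Note that it is not even obvious yet that the $c_n$s are increasing. To get a bit more control, we proceed as follows. Set $v = m + n$, and express
$$X = \frac{c_m}{c_v} X_1 + \frac{c_n}{c_v} X_2,$$
from which we can make the deduction
$$\mathbb{P}(X > t) \geq \mathbb{P}\left(X_1 \geq 0,\; X_2 > t \frac{c_v}{c_n}\right) \geq \frac12 \mathbb{P}\left(X_2 > t \frac{c_v}{c_n}\right).$$
So most importantly, by taking $t$ large in the above, and using that X is symmetric, we can obtain an upper bound
$$\mathbb{P}\left(X_2 > t \frac{c_v}{c_n}\right) \leq 2\,\mathbb{P}(X > t) \leq \delta < \frac12,$$
in fact for any $\delta \in (0, \frac12)$ if we take $t$ large enough, and the choice of $t$ does not depend on $m$ or $n$. But since
$$\mathbb{P}(X_2 > 0) = \frac12 \left(1 - \mathbb{P}(X_2 = 0)\right)$$
(which should in most cases be $\frac12$), this implies that $\frac{c_v}{c_n}$ cannot be very close to 0. In other words, $\frac{c_n}{c_{m+n}}$ is bounded above. This is in fact regularity enough to deduce that $c_n = n^{1/\alpha}$ for some $\alpha > 0$ from the Cauchy-type functional equation (*).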
It remains to check that $\alpha \leq 2$. Note that the equality case $\alpha = 2$ corresponds exactly to the $\sqrt{n}$ scaling we saw for the normal distribution, in the context of the CLT. This motivates the proof. If $\alpha > 2$, we will show that the variance of X is finite, so CLT applies. This gives some control over $c_n$ in an $n \to \infty$ limit, which is plenty to ensure a contradiction.
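To show the variance is finite, we use the definition of stable to check that there is a value of t such that
$$\mathbb{P}(S_n > t c_n) = \mathbb{P}(X > t) < \frac14 \quad \text{for all } n.$$
Now consider the event that the maximum of the $X_i$s is $> t c_n$ and that the sum of the rest is non-negative. This has, by independence and symmetry, at least half the probability of the event demanding just that the maximum be bounded below, and furthermore is contained within the event with probability $< \frac14$ shown above. So if we set
$$z(n) = n\,\mathbb{P}(X > t c_n)$$
we then have
$$\frac14 > \mathbb{P}(S_n > t c_n) \geq \frac12 \mathbb{P}\left(\max_{i \leq n} X_i > t c_n\right) = \frac12 \left[1 - \left(1 - \frac{z(n)}{n}\right)^n\right].$$
So, $z(n)$ is bounded as $n$ varies. Rescaling suitably (put $x = t c_n = t n^{1/\alpha}$), this gives that
$$x^\alpha\, \mathbb{P}(X > x) < M \quad \text{for all } x > 0, \text{ for some } M < \infty.$$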
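The boundedness of $z(n)$ can be made explicit. Rearranging $\frac12[1 - (1 - z/n)^n] < \frac14$ gives $z < n(1 - 2^{-1/n})$, and since $1 - e^{-x} \leq x$ the right-hand side never exceeds $\log 2$. The short sketch below just tabulates this threshold for a few values of n; the function name `z_upper_bound` is introduced here for illustration only.

```python
import math


def z_upper_bound(n):
    """Threshold on z implied by (1/2) * (1 - (1 - z/n)**n) < 1/4.

    Rearranging gives (1 - z/n)**n > 1/2, i.e. z < n * (1 - 2**(-1/n)).
    """
    return n * (1.0 - 2.0 ** (-1.0 / n))


bounds = {n: z_upper_bound(n) for n in (1, 2, 10, 100, 10_000)}
for n, b in bounds.items():
    print(n, b)

print("log 2 =", math.log(2))  # every threshold above stays below this value
```

So $z(n) < \log 2$ for every n, whatever the underlying symmetric stable distribution.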
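The tail bound $x^\alpha\, \mathbb{P}(X > x) < M$ is exactly what we need to control the variance, as:
$$\mathbb{E}X^2 = \int_0^\infty \mathbb{P}(X^2 > s)\,ds = \int_0^\infty 2u\,\mathbb{P}(|X| > u)\,du = \int_0^\infty 4u\,\mathbb{P}(X > u)\,du \leq \int_0^\infty \min\left(2u,\; 4M u^{1-\alpha}\right) du < \infty,$$
using that X is symmetric for the third equality and that $\alpha > 2$ for the final inequality. But we know from CLT that if the variance is finite, we must have $\alpha = 2$.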
All that remains is to mention how stable distributions fit into the context of limits in distribution of RVs. This is little more than a definition.
We say F is in the domain of attraction of a broadly stable distribution R if, writing $S_n$ for the sum of $n$ IID RVs with distribution F,
$$\exists\, a_n > 0,\ b_n \quad \text{such that} \quad \frac{S_n - b_n}{a_n} \stackrel{d}{\to} R.$$
The role of $b_n$ is not hugely important, as a broadly stable distribution is in the domain of attraction of the corresponding strictly stable distribution.
The natural question to ask is: do the domains of attraction of stable distributions (for $\alpha \in (0,2]$) partition the space of probability distributions, or is some extra condition required?
Next time I will talk about stable distributions in a more analytic context, and in particular how a discussion of their properties is motivated by the construction of Levy processes.
- Large Deviations and the CLT (eventuallyalmosteverywhere.wordpress.com)