One of the questions I posed at the end of the previous post about the Central Limit Theorem was this: what is special about the normal distribution?
More precisely, for a large class of variables (those with finite variance) the limit in distribution of the partial sums $S_n = X_1 + \cdots + X_n$, after a natural rescaling (subtract the mean $n\mu$ and divide by $\sigma\sqrt{n}$), is distributed as N(0,1). As a starting point for investigating similar results for a more general class of underlying distributions, it is worth considering what properties we might require of a distribution if it is to appear as a limit in distribution of sums of IID RVs, rescaled if necessary.
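For concreteness, here is a minimal simulation sketch (my own illustration, not part of the original post) of the rescaling in question, using Exponential(1) variables, for which $\mu = \sigma^2 = 1$:

```python
import numpy as np

# Minimal check of the CLT rescaling: X_i ~ Exponential(1) has mu = sigma^2 = 1,
# so (S_n - n*mu) / (sigma * sqrt(n)) should look like N(0,1) for large n.
rng = np.random.default_rng(0)
n, trials = 1_000, 20_000
sums = rng.exponential(scale=1.0, size=(trials, n)).sum(axis=1)
rescaled = (sums - n) / np.sqrt(n)

print("mean:", rescaled.mean())              # approximately 0
print("std:", rescaled.std())                # approximately 1
print("P(Z <= 1):", (rescaled <= 1).mean())  # approximately Phi(1) ~ 0.841
```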
The property required is that the distribution is stable. In the rest of the post I am going to give an informal precis of the content of the relevant chapter of Feller.
Throughout, we assume a collection of IID RVs, $X_1, X_2, \ldots$, with the initial sums $S_n = X_1 + \cdots + X_n$. Then we say $X$ is stable in the broad sense if
$S_n \stackrel{d}{=} c_n X + \gamma_n,$
for some deterministic parameters $c_n > 0, \gamma_n$, for every n. If in fact
$S_n \stackrel{d}{=} c_n X,$
then we say $X$ is stable in the strict sense. I’m not sure if this division into strict and broad is still widely drawn, but anyway. One interpretation might be that a collection of distributions is stable if they form a non-trivial subspace of the vector space of random variables and also form a subgroup under the operation of adding independent RVs. I’m not sure that this is hugely useful either though. One observation is that if $\mu = \mathbb{E}X$ exists and is 0, then so are all the $\gamma_n$s.
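As a concrete example (standard, and my own addition rather than part of the original argument): the normal distribution fits this definition, and centring it removes the $\gamma_n$ term, matching the observation about means.

```latex
% Example: X ~ N(mu, sigma^2) is stable in the broad sense.
% Since S_n ~ N(n mu, n sigma^2), it has the same law as sqrt(n) X + (n - sqrt(n)) mu:
\[
  S_n \overset{d}{=} \sqrt{n}\,X + (n - \sqrt{n})\mu,
  \qquad c_n = \sqrt{n}, \quad \gamma_n = (n - \sqrt{n})\mu .
\]
% In particular, if mu = 0 then every gamma_n = 0, so the centred normal
% distribution is stable in the strict sense.
```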
The key result to be shown is that
$c_n = n^{1/\alpha}$
for some $\alpha \in (0, 2]$.
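For a non-normal instance of this result (an illustrative check of my own, not part of the post): the standard Cauchy distribution is strictly stable with $c_n = n$, that is $\alpha = 1$, since the average of $n$ IID standard Cauchy variables is again standard Cauchy. A minimal simulation sketch, assuming numpy and scipy are available:

```python
import numpy as np
from scipy import stats

# Standard Cauchy is strictly stable with c_n = n (so alpha = 1):
# S_n / n should again be standard Cauchy for every n.
rng = np.random.default_rng(1)
n, trials = 1_000, 10_000
sums = rng.standard_cauchy(size=(trials, n)).sum(axis=1)

# Kolmogorov-Smirnov test of S_n / n against the standard Cauchy distribution;
# a non-tiny p-value is consistent with S_n / n being standard Cauchy.
result = stats.kstest(sums / n, "cauchy")
print("KS statistic:", result.statistic, "p-value:", result.pvalue)
```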
Relevant though the observation about means is, a more useful one is this. The stability property is retained if we replace the distribution of $X$ with the distribution of $X_1 - X_2$ (independent copies naturally!). The behaviour of the $c_n$s is also preserved. Now we can work with an underlying distribution that is symmetric about 0, rather than merely centred. The deduction that $\gamma_n = 0$ still holds now, whether or not X has a mean.
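To spell out why (a two-line check, my own filling-in): write $X^s = X - X'$ for independent copies $X, X'$, and note that summing $n$ IID copies of $X^s$ is the same in distribution as taking the difference of two independent copies of $S_n$.

```latex
% S_n' denotes an independent copy of S_n, and S_n^s the sum of n IID copies of X^s.
% Symmetrisation preserves stability and the constants c_n:
\[
  S_n^{s} = S_n - S_n'
  \overset{d}{=} (c_n X + \gamma_n) - (c_n X' + \gamma_n)
  = c_n (X - X') = c_n X^{s},
\]
% so X^s is stable in the strict sense (the centring terms cancel),
% with exactly the same c_n as the original X.
```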
Now we proceed with the proof. All equalities are taken to be in distribution unless otherwise specified. By splitting $S_{m+n}$ into two smaller sums, we deduce that
$c_{m+n} X \stackrel{d}{=} c_m X_1 + c_n X_2.$
Extending this idea, we have
$c_{k_1 + \cdots + k_r} X \stackrel{d}{=} c_{k_1} X_1 + \cdots + c_{k_r} X_r.$
Note that it is not even obvious yet that the $c_n$s are increasing. To get a bit more control, we proceed as follows. Set $N = mn$, and express $S_N$ as the sum of $m$ independent blocks, each distributed as $S_n$, from which we can make the deduction
$c_{mn} = c_m c_n.$  (*)
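Written out (a short expansion of the step above, added for completeness):

```latex
% Grouping the mn terms into m independent blocks of n consecutive terms:
\[
  S_{mn} = \sum_{j=1}^{m} \bigl( S_{jn} - S_{(j-1)n} \bigr)
  \overset{d}{=} c_n \, (X_1 + \cdots + X_m)
  \overset{d}{=} c_n c_m X .
\]
% Comparing with S_{mn} = c_{mn} X (in distribution), and using that X is
% symmetric and non-degenerate, the scaling constants must agree: c_{mn} = c_m c_n.
```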
So most importantly, by intersecting with the event $\{X_2 \geq 0\}$ in the splitting relation $c_{m+n} X \stackrel{d}{=} c_m X_1 + c_n X_2$, and using that X is symmetric, we obtain
$\mathbb{P}\left(X \geq \tfrac{c_m}{c_{m+n}} x\right) = \mathbb{P}(c_m X_1 + c_n X_2 \geq c_m x) \geq \tfrac12 \mathbb{P}(X \geq x).$
On the other hand, we can obtain an upper bound
$\mathbb{P}\left(X \geq \tfrac{c_m}{c_{m+n}} x\right) \leq \epsilon,$
in fact for any $\epsilon > 0$, if we take $\tfrac{c_m}{c_{m+n}} x$ large enough. But since the lower bound $\tfrac12 \mathbb{P}(X \geq x)$ (which should in most cases be strictly positive) holds for every m and n, this implies that $\tfrac{c_{m+n}}{c_m}$ cannot be very close to 0. In other words, $\tfrac{c_m}{c_{m+n}}$ is bounded above. This is in fact regularity enough to deduce that
$c_n = n^{1/\alpha}$
from the Cauchy-type functional equation (*).
It remains to check that $\alpha \leq 2$. Note that the equality case $\alpha = 2$
corresponds exactly to the $c_n = \sqrt{n}$ scaling we saw for the normal distribution, in the context of the CLT. This motivates the proof. If $\alpha > 2$, we will show that the variance of X is finite, so CLT applies. This gives some control over $c_n$ in an $n \to \infty$ limit, which is plenty to ensure a contradiction.
To show the variance is finite, we use the definition of stable to check that there is a value of t such that
$\mathbb{P}(S_n \geq t c_n) \leq \tfrac14 \quad\text{for all } n$
(indeed $\mathbb{P}(S_n \geq t c_n) = \mathbb{P}(X \geq t)$, so any t with $\mathbb{P}(X \geq t) \leq \tfrac14$ works). Now consider the event that the maximum of the $X_i$s is at least $t c_n$ and that the sum of the rest is non-negative. This has, by independence and the symmetry of the remaining sum, exactly half the probability of the event demanding just that the maximum be bounded below by $t c_n$, and furthermore is contained within the event with probability at most $\tfrac14$ shown above. So if we set
$p_n = \mathbb{P}(X_1 \geq t c_n),$
we then have
$\tfrac12\bigl(1 - (1 - p_n)^n\bigr) \leq \tfrac14, \quad\text{that is}\quad (1 - p_n)^n \geq \tfrac12.$
So, $n p_n = n\,\mathbb{P}(X \geq t c_n)$ is bounded as $n$ varies. Rescaling suitably, this gives that
$\mathbb{P}(X \geq x) \leq C x^{-\alpha} \quad\text{for some constant } C \text{ and all sufficiently large } x.$
This is exactly what we need to control the variance, as:
$\mathbb{E}[X^2] = \int_0^\infty 2x\,\mathbb{P}(|X| \geq x)\,dx = \int_0^\infty 4x\,\mathbb{P}(X \geq x)\,dx < \infty,$
using that X is symmetric and that $\alpha > 2$ for the final equality and the convergence of the integral. But we know from CLT that if the variance is finite, we must have $c_n$ growing like $\sqrt{n}$, that is $\alpha = 2$, contradicting the assumption $\alpha > 2$.
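The polynomial tail bound is also easy to see numerically. Here is a minimal sketch (my own illustration, assuming scipy's levy_stable distribution is available): for a symmetric stable law with $\alpha = 1.5$, the quantity $x^{\alpha}\,\mathbb{P}(X \geq x)$ stays of the same order as $x$ grows, consistent with $\mathbb{P}(X \geq x) \leq C x^{-\alpha}$.

```python
import numpy as np
from scipy import stats

# Empirical check that a symmetric alpha-stable law has tail P(X >= x) = O(x^{-alpha}).
alpha = 1.5
X = stats.levy_stable.rvs(alpha, 0.0, size=100_000, random_state=2)  # beta = 0: symmetric

for x in [2.0, 5.0, 10.0, 20.0]:
    tail = (X >= x).mean()
    print(f"x = {x:5.1f}  P(X >= x) = {tail:.5f}  x^alpha * P(X >= x) = {x**alpha * tail:.3f}")
# The last column stays of the same order as x grows, as the tail bound predicts.
```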
All that remains is to mention how stable distributions fit into the context of limits in distribution of RVs. This is little more than a definition.
We say F is in the domain of attraction of a broadly stable distribution R if there exist sequences of constants $a_n > 0$ and $b_n$ such that
$\frac{S_n - b_n}{a_n} \stackrel{d}{\longrightarrow} R,$
where $S_n = X_1 + \cdots + X_n$ for IID RVs $X_1, X_2, \ldots$ with distribution F.
The role of the $b_n$s is not hugely important, as a broadly stable distribution is in the domain of attraction of the corresponding strictly stable distribution.
The natural question to ask is: do the domains of attraction of stable distributions (for $0 < \alpha \leq 2$) partition the space of probability distributions, or is some extra condition required?
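To make the definition concrete (an illustrative sketch of a standard example, not from the post): a symmetric distribution with Pareto tail $\mathbb{P}(|X| > x) = x^{-\alpha}$ for $x \geq 1$ and $0 < \alpha < 2$ lies in the domain of attraction of the symmetric $\alpha$-stable law, with $a_n = n^{1/\alpha}$ and $b_n = 0$. One crude numerical check is that the rescaled sums have roughly the same quantiles for different large values of n:

```python
import numpy as np

# Symmetric Pareto(alpha): |X| = U^{-1/alpha} with U ~ Uniform(0,1) and a random sign.
# For 0 < alpha < 2 this lies in the domain of attraction of the symmetric
# alpha-stable law, with normalisation a_n = n^{1/alpha} and b_n = 0.
alpha = 1.5
rng = np.random.default_rng(3)

def rescaled_sums(n, trials=2_000):
    u = rng.uniform(size=(trials, n))
    signs = rng.choice([-1.0, 1.0], size=(trials, n))
    x = signs * u ** (-1.0 / alpha)
    return x.sum(axis=1) / n ** (1.0 / alpha)

# Quantiles of S_n / n^{1/alpha} should stabilise as n grows (same limiting law).
for n in [500, 5_000]:
    q = np.quantile(rescaled_sums(n), [0.1, 0.25, 0.5, 0.75, 0.9])
    print(n, np.round(q, 3))
```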
Next time I will talk about stable distributions in a more analytic context, and in particular how a discussion of their properties is motivated by the construction of Levy processes.
Related articles
- Large Deviations and the CLT (eventuallyalmosteverywhere.wordpress.com)