So last week I was writing an article for Betting Expert about laws of large numbers, and I was trying to produce some representations of distributions to illustrate the Weak LLN and the Central Limit Theorem. Because tossing a coin feels too simplistic, and also because the natural state space for this random variable, at least verbally, is not a subset of the reals, I decided to go for dice instead. So it’s clear what the distribution of the outcome of a single dice roll is, and with a bit of thought or a 6×6 grid, we can work out the distribution of the average of two dice rolls. But what about 100 rolls? Obviously, we need large samples to illustrate the laws of large numbers! In this post, we discuss how to calculate the distribution of the sample mean of n dice rolls.
First we observe that the total set of outcomes of n dice rolls is $\{1,2,\ldots,6\}^n$, which has size $6^n$. The sum of the outcomes must lie between n and 6n inclusive. The distribution of the sum and the distribution of the sample mean are equivalent up to dividing by n. The final observation is that because the total number of outcomes has a nice form, we shouldn’t expect it to make any difference to the method whether we calculate the probability of a given sum, or the number of configurations giving rise to that sum.
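For small n, all of this is easy to check by brute-force enumeration of the $6^n$ configurations. A minimal sketch in Python (the function name `sum_distribution` is my own):

```python
from collections import Counter
from itertools import product

def sum_distribution(n):
    """Count how many of the 6^n configurations give each possible sum."""
    return Counter(sum(rolls) for rolls in product(range(1, 7), repeat=n))

counts = sum_distribution(2)
assert sum(counts.values()) == 36   # 6^2 configurations in total
assert counts[7] == 6               # the familiar 6 ways to make 7 on the 6x6 grid
```

Of course, this approach becomes hopeless long before n = 100, which is exactly why we want a generating function argument.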
Indeed, tying in nicely with the first year probability course, we are going to use generating functions, and there is no difference in practice between the probability generating function and the combinatorial generating function if the underlying mechanism is a uniform choice. Well, in practice, there is a small difference, namely a factor of 6 here (or $6^n$ once we consider the sum of n rolls). The motivation for using generating functions is clear: we are considering the distribution of a sum of independent random variables. This is pretty much exactly why we bother to set up the machinery for PGFs.
Anyway, since each of {1,2,…,6} is equally likely, the GF of a single dice roll is

$$f(x) = \frac{1}{6}\left(x + x^2 + \cdots + x^6\right) = \frac{x(1-x^6)}{6(1-x)}.$$
So, if we want the generating function of the sum of n independent dice rolls, we can obtain this by raising the above function to the power n. We obtain

$$f(x)^n = \frac{x^n(1-x^6)^n}{6^n(1-x)^n}.$$
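Numerically, raising the generating function to the power n is just repeated polynomial multiplication, i.e. repeated convolution of coefficient lists. A quick sketch in pure Python (the function name `dice_sum_pgf` is my own):

```python
def dice_sum_pgf(n):
    """Coefficients of ((x + x^2 + ... + x^6)/6)^n: entry k is P(sum = k)."""
    single = [0.0] + [1.0 / 6] * 6           # PGF of one roll: x/6 + ... + x^6/6
    coeffs = [1.0]                           # the constant polynomial 1
    for _ in range(n):
        # polynomial product = convolution of coefficient lists
        new = [0.0] * (len(coeffs) + len(single) - 1)
        for i, a in enumerate(coeffs):
            for j, b in enumerate(single):
                new[i + j] += a * b
        coeffs = new
    return coeffs

probs = dice_sum_pgf(2)
assert abs(probs[7] - 6 / 36) < 1e-12        # sum of two dice equals 7
```

This already handles n = 100 comfortably, but the closed-form coefficient we derive below is more satisfying, and tells us more.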
Note the factor of $x^n$ at the beginning arises because the minimum value of the sum is n. So to work out the number of configurations giving rise to sum $n+m$, we need to evaluate the coefficient of $x^m$ in $(1-x^6)^n(1-x)^{-n}$. We can deal with $(1-x^6)^n$ fairly straightforwardly, but some thought is required regarding whether it’s possible to do a similar job on $(1-x)^{-n}$.
We have to engage briefly with what is meant by a binomial coefficient. Note that

$$\binom{x}{k} = \frac{x(x-1)\cdots(x-k+1)}{k!}$$

is a valid definition even when x is not a positive integer, as it is simply a degree k polynomial in x. This works if x is a general positive real, and indeed if x is a general negative real. At this stage, we do need to keep k a positive integer, but that’s not a problem for our applications.
So we need to engage with how the binomial theorem works for exponents that are not positive integers. The tricky part with the standard expression as

$$(a+b)^n = \sum_{k=0}^{n}\binom{n}{k} a^k b^{n-k}$$

is that the attraction of this symmetry in a and b prompts us to work in more generality than is entirely necessary to state the result. Note if we instead write

$$(1+x)^n = \sum_{k\ge 0}\binom{n}{k} x^k,$$

we have unwittingly described this finite sum as an infinite series. It just happens that all the binomial coefficients apart from the first (n+1) are zero. The nice thing about this definition is that it might plausibly generalise to non-integer or negative values of n. And indeed it does. I don’t want to go into the details here, but it’s just a Taylor series really, and the binomial coefficients are set up with factorials in the right places to look like a Taylor series, so it all works out.
It is also worth remarking that it follows straight from the definition of a negative binomial coefficient that

$$\binom{-n}{k} = (-1)^k \binom{n+k-1}{k}.$$
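This identity is easy to check numerically straight from the polynomial definition of the binomial coefficient given above. A sketch using exact rational arithmetic (the function name `gen_binom` is my own):

```python
from fractions import Fraction
from math import comb, factorial

def gen_binom(x, k):
    """x(x-1)...(x-k+1) / k!, valid for any rational x and integer k >= 0."""
    num = Fraction(1)
    for i in range(k):
        num *= (Fraction(x) - i)
    return num / factorial(k)

# the negative binomial identity: binom(-n, k) = (-1)^k * binom(n+k-1, k)
for n in range(1, 6):
    for k in range(8):
        assert gen_binom(-n, k) == (-1) ** k * comb(n + k - 1, k)
```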
In any case, we can rewrite our expression for the generating function of the IID sum as

$$f(x)^n = \frac{x^n}{6^n}\,(1-x^6)^n \sum_{j\ge 0}\binom{n+j-1}{j} x^j.$$
By accounting for where we can gather exponents from each bracket, we can evaluate the coefficient of $x^{n+m}$, and hence the probability that the sum is $n+m$, as

$$\frac{1}{6^n}\sum_k (-1)^k \binom{n}{k}\binom{n+m-6k-1}{m-6k}.$$

Ie, $k$ in the sum takes values in $\{0,1,\ldots,\lfloor m/6\rfloor\}$. At least in theory, this now gives us an explicit way to calculate the distribution of the average of multiple dice rolls. We have to be wary, however, that in many languages the large factorials inside the binomial coefficients overflow standard numeric types, as they grow extremely rapidly. An approximation using logs (for instance via the log-gamma function) is likely to be more tractable for larger settings.
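Putting the formula into code: Python's integers and `math.comb` are arbitrary precision, so the alternating sum can be evaluated exactly, sidestepping the overflow problem just mentioned (the log-gamma route via `math.lgamma` would be the fix in a fixed-width language). A sketch, with the function name `p_sum` my own:

```python
from math import comb

def p_sum(n, s):
    """P(sum of n fair dice = s), via the alternating-sum coefficient formula."""
    if not n <= s <= 6 * n:
        return 0.0
    m = s - n  # we want the coefficient of x^{n+m}
    total = sum((-1) ** k * comb(n, k) * comb(n + m - 6 * k - 1, m - 6 * k)
                for k in range(m // 6 + 1))
    return total / 6 ** n

assert p_sum(2, 7) == 6 / 36            # matches the 6x6-grid count
assert p_sum(3, 10) == 27 / 216         # 27 configurations of three dice sum to 10
```

Dividing s by n (equivalently, relabelling the support) then gives the distribution of the sample mean, which is what we wanted for n = 100.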
Anyway, I leave you with the fruits of my labours.