# The Poisson Process – Distributions of Holding Times

So, I was planning to conclude my lecture course on Markov Chains at the Cambridge-Linyi summer school with an hour devoted to the Poisson Process. Unfortunately, working through the discrete time material and examples took longer than I’d expected, so we never got round to it. As I’d put a fair bit of work into the preparation, I figured I might as well write it up here instead.

We need a plausible mathematical model for a situation where people or events arrive randomly in continuous time, in a manner that is as close as possible to the notion of iid random variables. In discrete time, a natural candidate would be a set of iid Bernoulli random variables. For example, with probability p a single bus will arrive in an time interval of a minute. With probability 1-p, no bus will arrive. We might have some logistical motivation for why it is not possible that two or more arrive in a given interval, or we could instead choose a more complicated distribution.

One way to proceed would be to specify the distribution of the times between arrivals. These should be independent and identically distributed, at least intuitively. However, although we might be able to give a sensible guess right now, it is not immediately clear what this distribution should be. For now, we merely remark that the arrival times are called $X_1,X_2,\ldots$, and the holding times between arrivals are defined by
$S_1=X_1, S_n=X_n-X_{n-1},n\geq 2$.

In fact the motivating discrete example gives us much of the machinery we will actually need. Recall that when we define probability distributions for continuously-valued random variables we need a different plan of attack than for discrete RVs. Whereas for the discrete case, it is enough to specify the probability of each outcome, for a continuous random variable, we have to specify the probabilities of intervals, and take care that they have the obvious additive and nesting properties that we want. Taking the integral (whether Riemannian or Lebesgue) of a so-called density function is a natural way to do this.

Similarly here, we build up from small time intervals. The first remark is this: it is natural that the expected number of arrivals in the first minute is equal to the expected number of arrivals in the second minute. After all, we are considering the most general process possible. If there are an infinite number of potential arriving agents, then behaviour in the first minute should be independent and equal (in distribution) to behaviour in the second minute. We can naturally extend this idea to a linear relation. If $N_s$ is the number of arrivals in the time [0,s], then we should have $\mathbb{E}N_s=\lambda s$, where $\lambda$ is some constant of proportionality, equal to $\mathbb{E}N_1$.

The key remark is that as $s\rightarrow 0$, $\mathbb{P}(N_s=1)$ becomes small, and $\mathbb{P}(N_s\geq 2)$ becomes very small. In fact it suffices that $\mathbb{P}(N_s \geq 2)=o(s)$, as this implies:

$\mathbb{P}(N_s=0)=1-\lambda s+o(s),\quad \mathbb{P}(N_s=1)=\lambda s+o(s).$

Note that we are not currently attempting a formal construction of the process. As always, finding a probability space with enough freedom to equip a continuous process is a fiddly task. We are for now just trying to work out what the distribution of the holding times between arrivals should be. There are obvious advantages to defining the process as a collection of iid random variables, for example that we can construct it on a product space.

To do this, we split the time interval [0,1] into blocks

$[0,1]=[0,\frac{1}{n}]\cup[\frac{1}{n},\frac{2}{n}]\cup\ldots\cup[\frac{n-1}{n},1]$

So the probability that someone arrives in the time $[\frac{k}{n},\frac{k+1}{n}]$ is $\frac{\lambda}{n}$. So

$\mathbb{P}(\text{no-one arrives in time }[0,1])=(1-\frac{\lambda}{n})^n\approx e^{-\lambda}.$

As we are working in an $n\rightarrow\infty$ regime, we can replace 1 by general time t, to obtain:

$\mathbb{P}(\text{no-one arrives in time }[0,t])=(1-\frac{\lambda t}{n})^n\approx e^{-\lambda t}.$

So the distribution function of the first arrival time is $F(t)=1-e^{-\lambda t}$ in the conventional notation.
Thus $X_1\sim \text{Exp}(\lambda)$.

However, to emphasis how useful the infinitissimal definition is for actual problems, consider these examples.

1) If we have two arrivals processes at the same object, for example arriving from the left and the right at the same shop, say with rates $\lambda,\mu$, then we want to show that the first arrival time is still exponential. Because of the linearity of expectation property, it is clearly from the definition that the total arrivals process is Poisson with rate $\lambda+\mu$, and so the result follows. Showing this by examining the joint distribution of two exponential random variables is also possible, but much less elegant.

2) Similarly, if we have two shops, A and B, and each arriving person chooses one at random, then the first arrival time at A is: with probability 1/2 distributed as Exp($\lambda$), with probability 1/4 as $\Gamma(2,\lambda)$, and so on. A fairly non-trivial calculation is required to show that this is the same as $Exp(\frac{1}{2}\lambda)$, whereas this follows almost instantly using the the infinitissimal definition.

Moral: with the infinitissimal case, the difficulties are all in the probability space. However, once we have settled those problems, everything else is nice as the key property is linear. Whereas for a construction by iid jump times, the existence and well-definedness is clear, but even for a distribution as tractable as the exponential random variable, manipulation can be tricky.