You have N counters serving customers. What is a better arrangement? To have N queues, one for each counter, as in a supermarket (in the heady days before self-checkout, when ‘something unexpected in the bagging area’ might necessitate a trip to the doctor); or a single queue feeding to counters when they become free, as at a train station, or airport passport control?
Queueing is a psychologically intense process, at least in this country. Here are some human points:
– When you have to choose your line, you might, if you have nothing better to do, spend your time in the queue worrying about whether you chose optimally; whether Fortune has given a customer who arrived after you an advantage. (*)
– A single queue feels long, but conversely gives its users the pleasing feeling of almost constant progress. A particularly difficult customer doesn’t have as great an effect.
Interestingly, we can qualify these remarks in purely mathematical terms. We shall see that the mean waiting times are the same, but that, as we might expect, the variance is less for the single queue.
So to begin, we assume that the arrival process is Poisson, and the service times are exponential. These assumptions are very much not required, but it makes it easier to set up the model as a Markov chain. We do need that
, the arrival rate, is less than
, where
is the service rate. Why? Well, otherwise the number of customers waiting to be served will almost surely grow to infinity. When supply is greater than demand, as we specify, we know that there will (almost surely) be infinitely many times when there are no customers in the system. This shows that the Markov chain is recurrent, and so it is meaningful to talk about a mean waiting time.
We assume that in the N-queues arrangement, if a counter is free but the corresponding queue is empty, then a customer from a non-zero queue will take the place if possible. We do not worry about how this customer is chosen. Indeed, we do not concern ourselves with how the customers choose a queue, except to ask that the choice algorithm is previsible. That is, they cannot look into the future and optimise (or otherwise) accordingly. This assumption is standard and reasonable by considering the real-world situation.
Now if we set things up in the right way, we can couple the two processes. Precisely, we start with no customers, and define

to be the sequence of arrival times (note that almost surely no two customers arrive at the same time). Now say that

are the times at which a customer starts being served, that is, moves from the queue to the counter. Say
are the service times of the customers, where
is the service time of the customer who leaves the queue at time
.
The important point is that service times are independent of the queueing process, and are i.i.d. This is where we use the fact that the queue choice strategies are previsible. So, in fact all we need to examine is how customers spend in the queue. Concretely, take Q the queueing time, and S the service time. These are independent so:
![\mathbb{E}[Q+S]=\mathbb{E}Q+\mathbb{E}S](https://s0.wp.com/latex.php?latex=%5Cmathbb%7BE%7D%5BQ%2BS%5D%3D%5Cmathbb%7BE%7DQ%2B%5Cmathbb%7BE%7DS&bg=ffffff&fg=333333&s=0&c=20201002)
and
.
Observe that
have the same meanings and distributions in both queue models. It is not obvious that
does. We notice that for both models, if we define
on
,
the number of the first n customers being served at time t, we have

Therefore, by induction on n, we have that
is well-defined as the same function of
for both models. This fact was given as implicitly obvious by many of the remarks on this topic which I found. In my opinion, it is no more obvious than the solution to the original problem.
We can also define, for both models, the number of customers served before the queue is empty again as:

ie, the least N for which the queue is empty just before the N+1-th customer arrives. As previously discussed, N is a.s. finite. The processes are Markov chains, the first time the queue is empty is a stopping time, and it is meaningful to consider the average waiting time of the N customers before the queue is empty again. There is an expectation over N implicit here too, but this won’t be necessary for our result.
We can now finish the proof. This mean waiting time is

for the single queue. For multiple queues, suppose the i-th customer to arrive is the
-th to be served, where
is a permutation. Then the waiting time is
,
which is clearly equal to the previous expression as we can reorder a finite sum.
For variance, we have to consider
$
Observe that for
, we have
and so
,
which we can then apply repeatedly to show that

with equality precisely when
. So we conclude that the variance is in general higher when there are N queues. And the reason is not that different to what we had originally suspected (*), namely that customers do not necessarily get served in the same order as they arrive when there are multiple queues, and so the variance of waiting time is greater.
References
This problem comes from Examples Sheet 1 of the Part III course Stochastic Networks. An official solution of the same flavour as this is also linked from the course page.