Parking on a ring, linear hashing

I’ve spent most of my doctorate trying to analyse how adding destructive dynamics affects the behaviour of a particular random growth process, the classical random graph. In this post I’m going to talk about another random growth process, which is slightly less natural, but for which one can show some similar qualitative properties.

The model, and the additive coalescent

Consider m places arranged in a circle, and for consistency of analogy we think of these as parking spaces. Some number n of cars will arrive one at a time. Each car will arrive at a space chosen uniformly at random. If it is empty they will park in it, otherwise they will look clockwise until they find an empty space, and park there. For now we are only interested in growth, so we assume cars never leave. We are interested in the sizes of blocks of consecutively parked cars.

The reason to consider this slightly unnatural statement is its equivalence to the problem of hashing with linear probing, apparently a key topic in computer science, which I won’t pretend that I know anything about. In any case, it’s a nice model, and it seems reasonable that it would have a basis in more realistic search algorithms.

So, how does the sequence of sizes of blocks of consecutively parked cars grow? Well, given the sequence of block sizes, it is reasonably easy to convince yourself that the order of the blocks around the circle is uniformly random, and the number of empty spaces between adjacent blocks is also uniformly random.

Assume for now that there are at least three blocks. A block of size x can merge with a block of size y with the arrival of the next car only if the blocks are adjacent, with exactly one empty space between them. The chance of this is uniform among all pairs of blocks. Now suppose this is the case, and that the block of size y lies clockwise from the block of size x. Then they will merge precisely if the next car arrives at any of the x occupied spaces in that block, or at the empty space between the pair of blocks. This has probability \frac{x+1}{m}. There’s also the opposite ordering to consider, where the block of size x lies clockwise from the other. The total probability of this merge \{x,y\}\mapsto \{x+y+1\} is therefore proportional to (x+y+2).

So the process of block sizes looks a bit like the additive coalescent, at least for large blocks. This is in contrast to the random graph process, where the sequence of component sizes behaves exactly like a multiplicative coalescent, where blocks merge at a rate proportional to the product of their sizes.

Asymptotics

As in the random graph process, it’s interesting to ask roughly how large the largest block will be in such a configuration. Pittel [3] considers the case where the number of empty places \ell = m-n \approx \beta m, for some \beta\in (0,1).

A less interesting model would be to choose the positions of the n cars uniformly at random. But then the size of a block is roughly geometric with parameter \beta, and there are \Theta(m) blocks with high probability. Relatively straightforward calculations in extreme value theory suggest that the largest block is likely to have size on the order of \log m in this setting.

Of course, the actual model is slightly more complicated, because the size of a block is self-reinforcing, since larger blocks are more likely to grow than smaller blocks. However, we can still get somewhere with naïve estimates. Let’s label the places clockwise. Then in order for there to be a block starting at 0 and stretching beyond \alpha \log m, a necessary condition is that at least \alpha \log m cars arrive at those places. The number of cars which arrive at those places is binomial, since there are n cars, and each arrives at a place chosen uniformly, and independently of the other cars. So this event corresponds to

\mathrm{Bin}(n,\frac{\alpha \log m}{m}) \ge \alpha \log m.

Then, since n\approx (1-\beta)n, this event corresponds approximately to

\mathrm{Po}((1-\beta)\alpha \log m) \ge \alpha \log m.

The probability that a Poisson RV is at least a constant multiple larger than its mean decays exponentially with the mean, hence in this case the probability is asymptotically some negative power of m, depending on the value of \alpha. But there are O(m) possible places for such a block to start, so whether we can apply a union bound usefully or not depends on whether the power of m is strictly less than -1.

Since all of this depends on \alpha, it is reasonable that everything is fine, and the largest block does have size at least \alpha \log m when \alpha is small, and very unlikely when \alpha is large. This heuristic argument fits with Pittel’s theorem. Indeed, his result shows much stronger concentration: that the fluctuations of the size of the largest block are O(1).

Critical regime and empirical processes

The following is a paraphrase of the introduction and some methods from [2].

Obviously, once m=m cars have arrived, there’s no room for manoeuvre and definitely all the places are taken in one giant block. But it’s not obvious in general what scaling for the number of gaps will give rise to giant blocks of \Theta(m) cars.

As for the random graph, we can find a process similar to the exploration process of a (random) graph which encodes much of the information we care about. Let Y_k be the number of cars which arrive at place k. So the sum of the Y_ks will be n, the total number of cars. Now consider the process

C_0=0, \ldots, C_{k+1}=C_k + Y_{k+1}-1.

A block has the property that the number of arrivals within that set of places is equal to the number of places. So every time this *empirical process* C drops below its previous running minimum, this indicates the end of a block. To make this equivalence precise, we need to be a bit careful about where we start counting. It works exactly if we start at the beginning of a block. If not, it might introduce some unwanted divisions within the first block.

What we have is a process that looks roughly like a random walk that is constrained to pass through the point (m,n-m), which is equal to (m,-l). Even if we aren’t totally precise about how this is like a random walk, we would expect to see Brownian fluctuations after rescaling. Indeed, we might expect to see a Brownian bridge added to a deterministic linear function with negative gradient. But this is only meaningful if the random part is at least as large as the deterministic part, and since the fluctuations have order \sqrt{m}, if l is much larger than this, the rescaled empirical process is essentially deterministic, so we won’t see any macroscopic excursions above the minimum.

If l is substantially smaller than \sqrt{m}, then there is no real difference between (m,-l) and (m,0), and what we see is just a Brownian bridge. At this point, where we choose to start the process is actually important. If we were to start it at the minimum of the Brownian bridge instead, we would have seen a Brownian excursion, which corresponds to one block occupying (almost) all of the places.

Unsurprisingly, the story is completed by considering \ell=\Theta(\sqrt{m}), where the rescaled empirical process looks like a slanted Brownian bridge, that is Brownian motion conditioned to pass through $(1,-\frac{\ell}{\sqrt{m})$. There isn’t an obvious fix to the question of where to start the process, but it turns out that the correct way is now adding a Brownian excursion onto the deterministic linear function with gradient - \frac{\ell}{\sqrt{m}}. It’s now reasonable that the excursions above the minimum should macroscopic.

This scaling limit works dynamically as well, where the same Brownian excursion is used for different gradients of the deterministic line, corresponding to \ell moving through the critical window m-\Theta(\sqrt{m}). Finally, a direction to Bertoin’s recent paper [1] for the model with an additional destructive property. Analogous to the forest fire, blocks of cars are removed at a rate proportional to their size (as a result, naturally, of ‘Molotov cocktails’…). Similar effects of self-organised criticality are seen when the rate of bombs is scaled appropriately.

References

[1] – Bertoin – Burning cars in a parking lot (paper / slides)

[2] – Chassaing + Louchard – Phase transition for parking blocks, Brownian excursion and coalescence (arXiv)

[3] – Pittel – Linear probing: the probable largest search time grows logarithmically with the number of records

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s