ISL14 N6 – Sums of Squares of Intervals

I wasn’t able to post this at the time because of the embargo on discussing this particular question until the end of IMO 2015.

I’m currently enjoying the slow descent into summer in the delightful surroundings of Tonbridge School where we are holding our final camp to select the UK team for this year’s IMO. The mechanism for this selection is a series of four exams following the format of the IMO itself. We take many of the questions for these tests from the shortlist of questions proposed but not used at the previous year’s competition. In this post, I’m going to talk about one of these questions. Obviously, since lots of other countries do the same thing, there is an embargo on publication or public-facing discussion of these problems until after the next competition, so this won’t appear for a couple of months.

The statement of the problem is the following:

IMO Shortlist 2014, N6: Let a_1<a_2<\ldots<a_n be pairwise coprime positive integers with a_1 a prime at least n+2. On the segment I=[0,a_1a_2\ldots a_n] of the real line, mark all integers that are divisible by at least one of the numbers a_1,\ldots,a_n. These points split I into a number of smaller segments. Prove that the sum of the squares of the lengths of these segments is divisible by a_1.

I marked our students’ attempts at this problem and spoke to them about it afterwards. In particular, the official solution we were provided with, and which we photocopied and gave to our students contains some good theory and some magic. On closer inspection, the magic turns out to be unnecessary, and actually distracts a little from the theory. Anyway, this is my guided tour of the problem.

We begin with two items from the rough work that some students submitted. Someone treated the case n=2, which is not quite trivial, but not hard at all, and remarked, sensibly, that n=3 seemed to have become hard. Packaged with this was the implicit observation that for the purpose of the question posed, a_1 plays a completely different role to the other quantities. While this seems obvious when presented in isolation, it’s a good thing to say explicitly so that we avoid wasting time imagining symmetries which are definitely not present later. We’ll sometimes refer to a_1 as p, whenever we are using its primality rather than its other roles.

The second idea coming out of rough work was to consider what happens when an interval gets broken into two by a ‘new point’, whatever that means. In particular, what happens to the sum of the squares of the interval lengths modulo p? The way the student had set up this procedure suggested that the calculations would be easy in the case where the point being added was a multiple of p=a_1. This ended up being a red herring, because there’s no real reason why we would want to add the multiples of a_1 at the end.

It does, however, suggest an inductive argument, if instead you add the points which are multiples of a_n, but not of any of \{a_1,\ldots,a_{n-1}\}, because the latter would already be present. It might seem like induction is illegal, since there is the condition a_1\ge n+2 already present in the statement. However, once a_1 is fixed, if the condition holds for a particular n, then clearly it holds for all smaller n (!), and so we can use induction to build up to the value of n required in the statement.

Now we are happy this might work in principle, let’s do it in practice. We are adding all these multiples of a_n so we want to have some way to index them. One choice is to use their actual values, or the quotient by a_n, ie all the integers up to \prod_{i=1}^{n-1}a_i which are coprime to a_n. But maybe after thinking about it, there will be a more helpful way to index them.

We are only interested in non-trivial splittings of intervals. We might initially be concerned that an existing interval might be split into more than two sub-intervals by the addition of the multiples of a_n, but the ordering suggested in the statement means this can’t happen. In general, let the initial size of the interval be K, and the sizes of the two sub-intervals be x and y for reasons which will become clear. Then the change in contribution to the total sum of interval lengths squared is given by

K^2 \quad \mapsto \quad x^2+y^2= K^2 - 2xy,

using that K=x+y. So the total change when we add all the multiples of a_{n} will be the sum of 2xy over all the intervals which get divided into two.

Now we have the challenge of evaluating this sum, or at least showing that it is divisible by p. Given a multiple a of a_1, what will x be?

Well, on the left of a, the distance to the nearest multiple of a_k is the remainder r_k of a on division by a_k, essentially by definition. So x will be the minimal such remainder as k varies. Similarly, y will be \min_k (a_k-r_k). We forget about the minus sign and the factor of 2 forever, since it doesn’t make any difference unless p=2, in which case we are already done, and so we now have our sum:

\sum_{a\text{ a new multiples of }a_n} \left( \min_i(r_i)\right) \left[ \min_j(a_j-r_j) \right].

Now we return to the question of how to index the new multiples of a_n. But the question’s easier now as we have some clues as to what might be useful. Each of the summands involves the remainders modulo the smaller a_is, so can we use these as part of the indexing?

Yes, of course we can. There’s a bijection between the possible sequences of remainders, and the new multiples of a_n. This is basically the statement of the Chinese Remainder Theorem. Our index set should then be

r_1,\ldots,r_{n-1} \in [1,a_1-1]\times\ldots\times [1,a_{n-1}-1].

Remember that the fact that they are not allowed to be zero is because we only care now about new points being added.

Even now though, it’s not obvious how to compute this sum, so the natural thing to try is to invert our view of what we are summing over. That is, to consider how many of the index sequences result in particular values of x and y. Given x and y at least one, each remainder r_k must lie in [x,a_k-y], ie there are a_i-(x+y)+1 values it is allowed to take. So our guess for the number of such indices might be \prod_{i=1}^{n-1} (a_i-(x+y)+1), which is indeed what I wrote up on the board before Neel pointed out that some of the sequences I’ve allowed do not actually attain this minimum.

Fortunately this is easily fixed. If the minimum for x is not attained, we know that r_k\in[x+1,a_k-y], and similarly for y. So we can apply the inclusion-exclusion principle to depth two to say that the number we actually want is

\prod_{i=1}^{n-1} (a_i-(x+y)+1) - 2\prod_{i=1}^{n-1} (a_i-(x+y)) + \prod_{i=1}^{n-1} (a_i-(x+y)-1).

This is a polynomial in (x+y), so we’ll call in P(x+y). We’ll come back to this expression. We still need to tidy up our overall sum, in particular by deciding what bounds we want on x and y. Thinking about the intervals of width p, it is clear that the bounds should be 1\le x,y and x+y\le p. So we currently have our sum as

\sum_{1\le x,y,\,x+y\le p} xy P(x+y),.

If you’ve got to this stage, I hope you would feel that this seems really promising. You’ve got to show that some sum of polynomial-like things over a reasonably well-behaved index set is divisible by p. If you don’t know general theorems that might confirm that this is true, once you’ve got statements like this, you might well guess then verify their existence.

First though, we need to split up this xy term, since our sum is really now all about x+y, which we’ll call z.

\sum_{2\le z\le p} P(z) \sum_{x+y=z,x,y\ge 1} xy.

Once we’ve had our thought that we mainly care about polynomials mod p at this stage, we might even save ourselves some work and just observe that \sum_{x+y=z}xy is a cubic in z. This is definitely not hard to check, using the formulae for triangle numbers and sums of squares. It’s also reasonable to conclude that the value of this is 0 for z=1, so we can change the indices on the sum so it doesn’t look like we’re missing a term. So we have

\sum_{z=1}^p P(z) Q(z),

where Q is some cubic, which we could work out if we really wanted to. But what’s the degree of P? P is made of a sum of 3 polynomials whose degree is clear because they are a product of linear factors. So each of these has degree (n-1), but in fact we can clearly see that the leading term cancels, and also with a tiny bit more care that the second order terms cancel too, so P has degree at most n-3.

[Tangent: students might well ask why it is convention to take the degree of the identically zero polynomial to be -\infty rather than zero. The fact that the above statement is true even when n=2 is a good justification for this.]

[Tangent 2: In my initially erroneous evaluation of the number of remainder sequences, without the inclusion-exclusion argument, I indeed ended up with P having degree n-1. I also worked out Q as being quadratic though, so my overall necessary condition was only one off from what was required, showing that two wrongs are not as good as two rights, but marginally better than only one wrong.]

Anyway, we are taking a sum of polynomials of degree at most n \le p-2. This is the point where one might get confused if you have been specifying everything in too much detail. The fact that I’ve said nothing about the polynomials apart from their degree is predicated on the fact that I now know I only need to only their degrees for what I want, but even if you had tracked everything very explicitly, you might still guess what to do at this stage.

It’s presented as a lemma in the official solution that if you have a polynomial of degree at most p-2, then \sum_{z=1}^p P(z)\equiv 0 modulo p. There are several ways to see this, of which a proof by induction is easiest to believe, but probably least intuitive. It’s clear what goes wrong for degree p-1 by FLT, so you probably want to look at the set \{z^k: z\in[1,p-1]\} modulo p. It won’t necessarily be all the residues modulo p, but thinking about primitive roots, or the roots of unity makes it fairly clear why this would be true. [Essentially any argument that turns multiplication into addition in a well-behaved way clears this up.]

The most important point is that this lemma is a) easy-ish and b) a natural thing to try given where you might have got in the question at this stage. From here you’re now basically done.

I don’t think this is an easy question at all, since even with lots of guiding through the steps there are a lot of tangents one might get drawn down, especially investigating exact forms of polynomials which it turns out we only need vague information about. Nonetheless, there are lots of lessons to be learned from the argument, which definitely makes it a good question at this level in my opinion.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s