RMM 2017 – Problems 2, 3 and 6

In the previous post, I discussed Problems 1, 4 and 5 from this year’s Romanian Master of Mathematics competition. In this post, I discuss the harder problems (modulo my subjective appreciation of difficulty).

Problem 2

Determine all positive integers n satisfying the following condition: for every monic polynomial P of degree at most n with integer coefficients, there exists a positive integer k \leq n, and (k+1) distinct integers x_1,\ldots,x_{k+1} such that

P(x_1) + P(x_2) + \cdots + P(x_k) = P(x_{k+1}).

Parsing this question deserve at least a moment. Straight after a first reading, I find it worth writing down any key quantifiers which I might forget later. Here, it’s the words at most. If you want to show the statement holds for n=2, you need to investigate monic polynomials with degree zero, one and two. You should also make sure that any instances of x_i really are always distinct.

This matters in competitions! Two of our contestants failed to get the mark for showing n=2 works, precisely because of not checking the linear case, and a third could have lost it for using examples which are sometimes not distinct. On hard papers, one mark actually is the difference between triumph and frustration. And of course it matters outside competitions too, since small cases are exactly what your reader might examine first, to check they understand the problem posed, so it’s not a good place for awkward errors.

I started by trying to show that it couldn’t possibly happen that every polynomial with degree at most n had this property, for some combinatorial reason. For example, that if every set of distinct integers could only be a solution set for a small number of polynomials, then we would end up with not enough polynomials. But I couldn’t make this work at all; every bound ended up heavily in the wrong direction.

The next natural question is, does a typical polynomial of degree at most n have this property? But choosing a typical polynomial is hard, so in fact I asked, do the simplest polynomials of degree at most n have this property? I think the simplest polynomials of degree at most n are \{1,x,x^2,\ldots,x^n\}. Under what circumstances does

x_1^m + \ldots x_k^m = x_{k+1}^m, (1)

have solutions in distinct integers? Famously, when k=2 and m\ge 3 this is a very very hard problem indeed. So the first point is that it though it might be useful to use Fermat’s Last Theorem, it would be foolish to pursue a strategy which, if successful, would have a proof of FLT as a sub-problem. At least, it would be foolish if the aim was to finish this strategy within a few hours.

So my main comment on this question is meta-mathematical. If lots of attempts at general arguments don’t work, there must be some special example that does it. And what properties do I want this special example to have? Maybe one might have thought of this from scratch, but my motivation came from (1) in the case m=p-1. Then, by Fermat’s Little Theorem, all the summands are equal to 1 or 0 modulo p. If k>p, then after discounting any uniform factors of p, we obtain a congruence equation which is, in informal terms,

\left(0\text{ or }1\right)+\ldots+\left(0\text{ or }1\right) \equiv \left(0\text{ or }1\right).

This looks really promising because it’s quite restrictive, but it’s still just a bit annoying: there are quite a few solutions. But it does give us the right idea, which is to find a polynomial P for which P(x)\equiv 1 modulo n. The equation 1+\ldots+1\equiv 1 modulo n has solutions only if the number of summands on the LHS is 1 modulo n. So in this context, this reduces to showing that P is, additionally, injective on the integers, ie that P(x)=P(y) only when x=y.

It’s a nice exercise to show the existence of polynomials which are constant modulo n, and a good problem to work out how to force injectivity. If a polynomial is increasing everywhere, then it is certainly injective, and so the problem ends up being slightly easier in the case where the degree is odd than when the degree is even, but this is a nice conclusion to a nice problem, so I’ll save it for any interested readers to finish themselves.

Problem 3

Let n be an integer greater than 1 and let X be an n-element set. A non-empty collection of subsets A_1,\ldots, A_k of X is tight if the union A_1 \cup \dots \cup A_k is a proper subset of X and no element of X lies in exactly one of the A_is. Find the largest cardinality of a collection of proper non-empty subsets of X, no non-empty subcollection of which is tight.

Note. A subset A of X is proper if A\neq X. The sets in a collection are assumed to be distinct. The whole collection is assumed to be a subcollection.

By Neel Nanda:

If |X|=n, there are 2^n possible subsets, so at first glance the answer could be a variety of things, from a linear to an exponential function of n, each of which would suggest a different approach. So the first step is to conjecture an answer, and by examining small cases it seems impossible to do better than 2n-2. There are several natural constructions for this bound, such as n subsets of size (n-1) and (n-2) subsets of size 1, so we guess this to be our answer (which later turn out to be right!).

From here, a solution is deceptively simple, though empirically the five full solutions in the contest show that it was by no means easy to find. We proceed by induction on the size of X, and want to show that any collection of subsets S has size at least (2n-2). By assumption all subcollections are not tight, so if the union of a subcollection is not the whole set X, then there is an element which appears in exactly one subset. This is a useful result, so we’d like to force a subcollection whose union is not the whole set X.

One way to guarantee that the union of a subcollection is not X is by taking the subcollection of all subsets not containing some element b. So there is some element c which appears in only one subset not containing b. If we choose b so that it’s the element contained in the fewest subsets of S, c is in at least as many subsets of S, but in only one subset not containing b. This means that at most one subset containing b doesn’t contain c. This is useful, because after removing at most 2 subsets (the coefficient of n in 2n-2, importantly!), we now have that every subset in S either contains both b and c or neither. This means that we can replace the pair (b,c) with a new element d, to get a new collection of subsets S’ of a set X’, of size n-1, so by induction |S| \le |S'|+2\le 2n-2.

There is also the case where all subsets contain b, but we can create an equivalent collection of subsets of X \ {b} by removing b from all subsets. So again by induction we are done.

Problem 6

Let ABCD be any convex quadrilateral and let P, Q, R, S be points on the segments AB, BC, CD, and DA, respectively. It is given that the segments PR and QS dissect ABCD into four quadrilaterals, each of which has perpendicular diagonals. Show that the points P, Q, R, S are concyclic.

I thought this problem was extremely hard. The official solution starts with a ‘magic lemma’, that isn’t quite so magic if you then read how it’s used. The overall claim is that PQ, RS and AC are concurrent (or parallel), and this is proved using the fact that the radical axis of the two circles with diameters PQ and RS also passes through this point of concurrency. Hunting for key properties of subsets of points in the diagram is an important skill in hard olympiad geometry, since it exactly reflects how problem-setters produce the problems. All the more so when there is lots of symmetry in the construction. But this is a hard example – there are a lot of potentially relevant subsets of the configuration.

When you’re really stuck with how to get involved in a synthetic configuration, you might consider using coordinates. Some of the UK students have been reading some chapters of a book (Euclidean Geometry in Mathematical Olympiads by Evan Chen. I’ve only had my own copy for a couple of days, but my initial impression is very positive – it fills a gap in the literature in a style that’s both comprehensive and readable.) focusing on various analytic approaches, so James and I felt it was safer to make sure we knew what the best settings were, and how far we could take them.

You almost certainly want the intersection of PR and QS to be your origin. I wanted to set up the configuration using the language of vectors, referenced by (P,Q,R,S). This was because PQ\perp BO and so on, hence \mathbf{b}\cdot (\mathbf{q}-\mathbf{p})=0 and so on. An alternative is to use complex numbers, which makes this condition a bit more awkward, but is more promising for the conclusion. Concyclity is not a natural property in vectors unless you can characterise the centre of the circle, but can be treated via cross-ratios in \mathbb{C}. You also have to decide whether to describe the collinearity of A, B and P by expressing \mathbf{p}=\lambda_{\mathbf{p}} \mathbf{a}+(1-\lambda_{\mathbf{p}})\mathbf{b}, or via something more implicit. There definitely are not four degrees of freedom here, since specifying A certainly defines at most one valid set of (B,C,D), so one is mindful we’ll have to eliminate many variables later. We also have to account for fact that \mathbf{r} is a negative scalar multiple of \mathbf{p}, and it’s not clear whether it’s better to break symmetry immediately, or use this towards the end of a calculation.

The point of writing this was that if your initial thought was ‘this looks promising via coordinate methods’, then I guess I agree. But there’s a difference between looking promising, and actually working, and there are lots of parameterisation options. It’s certainly worth thinking very carefully about which to choose, and in this case, challenging though they were, the synthetic or synthetic-trigonometric methods probably were better.

ISL14 N6 – Sums of Squares of Intervals

I wasn’t able to post this at the time because of the embargo on discussing this particular question until the end of IMO 2015.

I’m currently enjoying the slow descent into summer in the delightful surroundings of Tonbridge School where we are holding our final camp to select the UK team for this year’s IMO. The mechanism for this selection is a series of four exams following the format of the IMO itself. We take many of the questions for these tests from the shortlist of questions proposed but not used at the previous year’s competition. In this post, I’m going to talk about one of these questions. Obviously, since lots of other countries do the same thing, there is an embargo on publication or public-facing discussion of these problems until after the next competition, so this won’t appear for a couple of months.

The statement of the problem is the following:

IMO Shortlist 2014, N6: Let a_1<a_2<\ldots<a_n be pairwise coprime positive integers with a_1 a prime at least n+2. On the segment I=[0,a_1a_2\ldots a_n] of the real line, mark all integers that are divisible by at least one of the numbers a_1,\ldots,a_n. These points split I into a number of smaller segments. Prove that the sum of the squares of the lengths of these segments is divisible by a_1.

I marked our students’ attempts at this problem and spoke to them about it afterwards. In particular, the official solution we were provided with, and which we photocopied and gave to our students contains some good theory and some magic. On closer inspection, the magic turns out to be unnecessary, and actually distracts a little from the theory. Anyway, this is my guided tour of the problem.

We begin with two items from the rough work that some students submitted. Someone treated the case n=2, which is not quite trivial, but not hard at all, and remarked, sensibly, that n=3 seemed to have become hard. Packaged with this was the implicit observation that for the purpose of the question posed, a_1 plays a completely different role to the other quantities. While this seems obvious when presented in isolation, it’s a good thing to say explicitly so that we avoid wasting time imagining symmetries which are definitely not present later. We’ll sometimes refer to a_1 as p, whenever we are using its primality rather than its other roles.

The second idea coming out of rough work was to consider what happens when an interval gets broken into two by a ‘new point’, whatever that means. In particular, what happens to the sum of the squares of the interval lengths modulo p? The way the student had set up this procedure suggested that the calculations would be easy in the case where the point being added was a multiple of p=a_1. This ended up being a red herring, because there’s no real reason why we would want to add the multiples of a_1 at the end.

It does, however, suggest an inductive argument, if instead you add the points which are multiples of a_n, but not of any of \{a_1,\ldots,a_{n-1}\}, because the latter would already be present. It might seem like induction is illegal, since there is the condition a_1\ge n+2 already present in the statement. However, once a_1 is fixed, if the condition holds for a particular n, then clearly it holds for all smaller n (!), and so we can use induction to build up to the value of n required in the statement.

Now we are happy this might work in principle, let’s do it in practice. We are adding all these multiples of a_n so we want to have some way to index them. One choice is to use their actual values, or the quotient by a_n, ie all the integers up to \prod_{i=1}^{n-1}a_i which are coprime to a_n. But maybe after thinking about it, there will be a more helpful way to index them.

We are only interested in non-trivial splittings of intervals. We might initially be concerned that an existing interval might be split into more than two sub-intervals by the addition of the multiples of a_n, but the ordering suggested in the statement means this can’t happen. In general, let the initial size of the interval be K, and the sizes of the two sub-intervals be x and y for reasons which will become clear. Then the change in contribution to the total sum of interval lengths squared is given by

K^2 \quad \mapsto \quad x^2+y^2= K^2 - 2xy,

using that K=x+y. So the total change when we add all the multiples of a_{n} will be the sum of 2xy over all the intervals which get divided into two.

Now we have the challenge of evaluating this sum, or at least showing that it is divisible by p. Given a multiple a of a_1, what will x be?

Well, on the left of a, the distance to the nearest multiple of a_k is the remainder r_k of a on division by a_k, essentially by definition. So x will be the minimal such remainder as k varies. Similarly, y will be \min_k (a_k-r_k). We forget about the minus sign and the factor of 2 forever, since it doesn’t make any difference unless p=2, in which case we are already done, and so we now have our sum:

\sum_{a\text{ a new multiples of }a_n} \left( \min_i(r_i)\right) \left[ \min_j(a_j-r_j) \right].

Now we return to the question of how to index the new multiples of a_n. But the question’s easier now as we have some clues as to what might be useful. Each of the summands involves the remainders modulo the smaller a_is, so can we use these as part of the indexing?

Yes, of course we can. There’s a bijection between the possible sequences of remainders, and the new multiples of a_n. This is basically the statement of the Chinese Remainder Theorem. Our index set should then be

r_1,\ldots,r_{n-1} \in [1,a_1-1]\times\ldots\times [1,a_{n-1}-1].

Remember that the fact that they are not allowed to be zero is because we only care now about new points being added.

Even now though, it’s not obvious how to compute this sum, so the natural thing to try is to invert our view of what we are summing over. That is, to consider how many of the index sequences result in particular values of x and y. Given x and y at least one, each remainder r_k must lie in [x,a_k-y], ie there are a_i-(x+y)+1 values it is allowed to take. So our guess for the number of such indices might be \prod_{i=1}^{n-1} (a_i-(x+y)+1), which is indeed what I wrote up on the board before Neel pointed out that some of the sequences I’ve allowed do not actually attain this minimum.

Fortunately this is easily fixed. If the minimum for x is not attained, we know that r_k\in[x+1,a_k-y], and similarly for y. So we can apply the inclusion-exclusion principle to depth two to say that the number we actually want is

\prod_{i=1}^{n-1} (a_i-(x+y)+1) - 2\prod_{i=1}^{n-1} (a_i-(x+y)) + \prod_{i=1}^{n-1} (a_i-(x+y)-1).

This is a polynomial in (x+y), so we’ll call in P(x+y). We’ll come back to this expression. We still need to tidy up our overall sum, in particular by deciding what bounds we want on x and y. Thinking about the intervals of width p, it is clear that the bounds should be 1\le x,y and x+y\le p. So we currently have our sum as

\sum_{1\le x,y,\,x+y\le p} xy P(x+y),.

If you’ve got to this stage, I hope you would feel that this seems really promising. You’ve got to show that some sum of polynomial-like things over a reasonably well-behaved index set is divisible by p. If you don’t know general theorems that might confirm that this is true, once you’ve got statements like this, you might well guess then verify their existence.

First though, we need to split up this xy term, since our sum is really now all about x+y, which we’ll call z.

\sum_{2\le z\le p} P(z) \sum_{x+y=z,x,y\ge 1} xy.

Once we’ve had our thought that we mainly care about polynomials mod p at this stage, we might even save ourselves some work and just observe that \sum_{x+y=z}xy is a cubic in z. This is definitely not hard to check, using the formulae for triangle numbers and sums of squares. It’s also reasonable to conclude that the value of this is 0 for z=1, so we can change the indices on the sum so it doesn’t look like we’re missing a term. So we have

\sum_{z=1}^p P(z) Q(z),

where Q is some cubic, which we could work out if we really wanted to. But what’s the degree of P? P is made of a sum of 3 polynomials whose degree is clear because they are a product of linear factors. So each of these has degree (n-1), but in fact we can clearly see that the leading term cancels, and also with a tiny bit more care that the second order terms cancel too, so P has degree at most n-3.

[Tangent: students might well ask why it is convention to take the degree of the identically zero polynomial to be -\infty rather than zero. The fact that the above statement is true even when n=2 is a good justification for this.]

[Tangent 2: In my initially erroneous evaluation of the number of remainder sequences, without the inclusion-exclusion argument, I indeed ended up with P having degree n-1. I also worked out Q as being quadratic though, so my overall necessary condition was only one off from what was required, showing that two wrongs are not as good as two rights, but marginally better than only one wrong.]

Anyway, we are taking a sum of polynomials of degree at most n \le p-2. This is the point where one might get confused if you have been specifying everything in too much detail. The fact that I’ve said nothing about the polynomials apart from their degree is predicated on the fact that I now know I only need to only their degrees for what I want, but even if you had tracked everything very explicitly, you might still guess what to do at this stage.

It’s presented as a lemma in the official solution that if you have a polynomial of degree at most p-2, then \sum_{z=1}^p P(z)\equiv 0 modulo p. There are several ways to see this, of which a proof by induction is easiest to believe, but probably least intuitive. It’s clear what goes wrong for degree p-1 by FLT, so you probably want to look at the set \{z^k: z\in[1,p-1]\} modulo p. It won’t necessarily be all the residues modulo p, but thinking about primitive roots, or the roots of unity makes it fairly clear why this would be true. [Essentially any argument that turns multiplication into addition in a well-behaved way clears this up.]

The most important point is that this lemma is a) easy-ish and b) a natural thing to try given where you might have got in the question at this stage. From here you’re now basically done.

I don’t think this is an easy question at all, since even with lots of guiding through the steps there are a lot of tangents one might get drawn down, especially investigating exact forms of polynomials which it turns out we only need vague information about. Nonetheless, there are lots of lessons to be learned from the argument, which definitely makes it a good question at this level in my opinion.