RMM 2017 – Problems 2, 3 and 6

In the previous post, I discussed Problems 1, 4 and 5 from this year’s Romanian Master of Mathematics competition. In this post, I discuss the harder problems (modulo my subjective appreciation of difficulty).

Problem 2

Determine all positive integers n satisfying the following condition: for every monic polynomial P of degree at most n with integer coefficients, there exists a positive integer k \leq n, and (k+1) distinct integers x_1,\ldots,x_{k+1} such that

P(x_1) + P(x_2) + \cdots + P(x_k) = P(x_{k+1}).

Parsing this question deserves at least a moment. Straight after a first reading, I find it worth writing down any key quantifiers which I might forget later. Here, it’s the words at most. If you want to show the statement holds for n=2, you need to investigate monic polynomials with degree zero, one and two. You should also make sure that any instances of x_i really are always distinct.

This matters in competitions! Two of our contestants failed to get the mark for showing n=2 works, precisely because of not checking the linear case, and a third could have lost it for using examples which are sometimes not distinct. On hard papers, one mark actually is the difference between triumph and frustration. And of course it matters outside competitions too, since small cases are exactly what your reader might examine first, to check they understand the problem posed, so it’s not a good place for awkward errors.

I started by trying to show that it couldn’t possibly happen that every polynomial with degree at most n had this property, for some combinatorial reason. For example, that if every set of distinct integers could only be a solution set for a small number of polynomials, then we would end up with not enough polynomials. But I couldn’t make this work at all; every bound ended up heavily in the wrong direction.

The next natural question is, does a typical polynomial of degree at most n have this property? But choosing a typical polynomial is hard, so in fact I asked, do the simplest polynomials of degree at most n have this property? I think the simplest polynomials of degree at most n are \{1,x,x^2,\ldots,x^n\}. Under what circumstances does

x_1^m + \ldots + x_k^m = x_{k+1}^m, (1)

have solutions in distinct integers? Famously, when k=2 and m\ge 3 this is a very hard problem indeed. So the first point is that, though it might be useful to use Fermat’s Last Theorem, it would be foolish to pursue a strategy which, if successful, would have a proof of FLT as a sub-problem. At least, it would be foolish if the aim was to finish this strategy within a few hours.

So my main comment on this question is meta-mathematical. If lots of attempts at general arguments don’t work, there must be some special example that does it. And what properties do I want this special example to have? Maybe one might have thought of this from scratch, but my motivation came from (1) in the case m=p-1. Then, by Fermat’s Little Theorem, all the summands are equal to 1 or 0 modulo p. If k>p, then after discounting any uniform factors of p, we obtain a congruence equation which is, in informal terms,

\left(0\text{ or }1\right)+\ldots+\left(0\text{ or }1\right) \equiv \left(0\text{ or }1\right).

This looks really promising because it’s quite restrictive, but it’s still just a bit annoying: there are quite a few solutions. But it does give us the right idea, which is to find a polynomial P for which P(x)\equiv 1 modulo n for every integer x. The equation 1+\ldots+1\equiv 1 modulo n has solutions only if the number of summands on the LHS is 1 modulo n. Since 1\le k\le n, this forces k=1, and the condition becomes P(x_1)=P(x_2) for distinct x_1,x_2. So in this context, the problem reduces to showing that P is, additionally, injective on the integers, ie that P(x)=P(y) only when x=y.

It’s a nice exercise to show the existence of polynomials which are constant modulo n, and a good problem to work out how to force injectivity. If a polynomial is increasing everywhere, then it is certainly injective, and so the problem ends up being slightly easier in the case where the degree is odd than when the degree is even, but this is a nice conclusion to a nice problem, so I’ll save it for any interested readers to finish themselves.
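The two congruence facts used above are easy to check by machine. Here is a minimal Python sketch; the helper name `values_mod` and the sample cubic x(x-1)(x-2)+1 are my own illustrative choices, not taken from the problem or its official solution. It relies only on the fact that an integer polynomial is periodic modulo n, so testing the residues 0,...,n-1 is enough.

```python
def values_mod(coeffs, n):
    """Set of values taken by P(x) mod n as x ranges over the integers.

    coeffs lists the integer coefficients from the constant term upwards.
    Since P(x + n) = P(x) (mod n), the residues 0..n-1 cover every case.
    """
    def evaluate(x):
        return sum(c * x**i for i, c in enumerate(coeffs)) % n

    return {evaluate(x) for x in range(n)}


# Fermat's Little Theorem: x^(p-1) is 0 or 1 modulo p (here p = 5).
print(values_mod([0, 0, 0, 0, 1], 5))

# Illustrative monic cubic x(x-1)(x-2) + 1 = x^3 - 3x^2 + 2x + 1,
# which is constant modulo 3.
print(values_mod([1, 2, -3, 1], 3))
```

Note that this sample cubic takes the value 1 at each of 0, 1 and 2, so it is certainly not injective; forcing injectivity is the part left as an exercise.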

Problem 3

Let n be an integer greater than 1 and let X be an n-element set. A non-empty collection of subsets A_1,\ldots, A_k of X is tight if the union A_1 \cup \dots \cup A_k is a proper subset of X and no element of X lies in exactly one of the A_is. Find the largest cardinality of a collection of proper non-empty subsets of X, no non-empty subcollection of which is tight.

Note. A subset A of X is proper if A\neq X. The sets in a collection are assumed to be distinct. The whole collection is assumed to be a subcollection.

By Neel Nanda:

If |X|=n, there are 2^n possible subsets, so at first glance the answer could be a variety of things, from a linear to an exponential function of n, each of which would suggest a different approach. So the first step is to conjecture an answer, and by examining small cases it seems impossible to do better than 2n-2. There are several natural constructions for this bound, such as n subsets of size (n-1) and (n-2) subsets of size 1, so we guess this to be our answer (which later turns out to be right!).

From here, a solution is deceptively simple, though empirically the five full solutions in the contest show that it was by no means easy to find. We proceed by induction on the size of X, and want to show that any collection of subsets S has size at most (2n-2). By assumption all subcollections are not tight, so if the union of a subcollection is not the whole set X, then there is an element which appears in exactly one subset. This is a useful result, so we’d like to force a subcollection whose union is not the whole set X.

One way to guarantee that the union of a subcollection is not X is by taking the subcollection of all subsets not containing some element b. So there is some element c which appears in only one subset not containing b. If we choose b so that it’s the element contained in the fewest subsets of S, c is in at least as many subsets of S, but in only one subset not containing b. This means that at most one subset containing b doesn’t contain c. This is useful, because after removing at most 2 subsets (the coefficient of n in 2n-2, importantly!), we now have that every subset in S either contains both b and c or neither. This means that we can replace the pair (b,c) with a new element d, to get a new collection of subsets S’ of a set X’, of size n-1, so by induction |S| \le |S'|+2\le 2n-2.

There is also the case where all subsets contain b, but we can create an equivalent collection of subsets of X \setminus \{b\} by removing b from all subsets. So again by induction we are done.
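For very small n the conjectured answer can be checked by exhaustive search. The following Python sketch is my own illustration, with hypothetical function names; it encodes the definition of a tight collection directly, checks the construction with n sets of size n-1 and n-2 singletons, and brute-forces the maximum. The search is exponential, so it is only sensible for n up to about 4.

```python
from itertools import combinations


def is_tight(sets, n):
    """Tight: the union is a proper subset of X = {0..n-1} and no element
    of X lies in exactly one of the sets."""
    union = set().union(*sets)
    if len(union) == n:
        return False
    return all(sum(x in s for s in sets) != 1 for x in range(n))


def has_tight_subcollection(collection, n):
    """Check every non-empty subcollection, including the whole collection."""
    return any(
        is_tight(sub, n)
        for r in range(1, len(collection) + 1)
        for sub in combinations(collection, r)
    )


def construction(n):
    """The n subsets of size n-1 together with n-2 singletons."""
    full = frozenset(range(n))
    big = [full - {i} for i in range(n)]
    small = [frozenset({i}) for i in range(n - 2)]
    return big + small


def max_tight_free(n):
    """Largest collection of proper non-empty subsets with no tight
    subcollection, found by brute force (tiny n only)."""
    proper = [frozenset(c) for r in range(1, n) for c in combinations(range(n), r)]
    best = 0
    for size in range(1, len(proper) + 1):
        # The property is inherited by subcollections, so we can stop at
        # the first size for which no collection works.
        if any(not has_tight_subcollection(col, n) for col in combinations(proper, size)):
            best = size
        else:
            break
    return best


print(max_tight_free(3))
```

The early exit in `max_tight_free` is justified because any subcollection of a collection with no tight subcollection again has no tight subcollection.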

Problem 6

Let ABCD be any convex quadrilateral and let P, Q, R, S be points on the segments AB, BC, CD, and DA, respectively. It is given that the segments PR and QS dissect ABCD into four quadrilaterals, each of which has perpendicular diagonals. Show that the points P, Q, R, S are concyclic.

I thought this problem was extremely hard. The official solution starts with a ‘magic lemma’, that isn’t quite so magic if you then read how it’s used. The overall claim is that PQ, RS and AC are concurrent (or parallel), and this is proved using the fact that the radical axis of the two circles with diameters PQ and RS also passes through this point of concurrency. Hunting for key properties of subsets of points in the diagram is an important skill in hard olympiad geometry, since it exactly reflects how problem-setters produce the problems. All the more so when there is lots of symmetry in the construction. But this is a hard example – there are a lot of potentially relevant subsets of the configuration.

When you’re really stuck with how to get involved in a synthetic configuration, you might consider using coordinates. Some of the UK students have been reading chapters of a book focusing on various analytic approaches (Euclidean Geometry in Mathematical Olympiads by Evan Chen; I’ve only had my own copy for a couple of days, but my initial impression is very positive, as it fills a gap in the literature in a style that’s both comprehensive and readable), so James and I felt it was safer to make sure we knew what the best settings were, and how far we could take them.

You almost certainly want the intersection of PR and QS to be your origin. I wanted to set up the configuration using the language of vectors, referenced by (P,Q,R,S). This was because PQ\perp BO and so on, hence \mathbf{b}\cdot (\mathbf{q}-\mathbf{p})=0 and so on. An alternative is to use complex numbers, which makes this condition a bit more awkward, but is more promising for the conclusion. Concyclicity is not a natural property in vectors unless you can characterise the centre of the circle, but can be treated via cross-ratios in \mathbb{C}. You also have to decide whether to describe the collinearity of A, B and P by expressing \mathbf{p}=\lambda_{\mathbf{p}} \mathbf{a}+(1-\lambda_{\mathbf{p}})\mathbf{b}, or via something more implicit. There definitely are not four degrees of freedom here, since specifying A certainly defines at most one valid set of (B,C,D), so one is mindful we’ll have to eliminate many variables later. We also have to account for the fact that \mathbf{r} is a negative scalar multiple of \mathbf{p}, and it’s not clear whether it’s better to break symmetry immediately, or use this towards the end of a calculation.
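To make the cross-ratio remark concrete: four distinct points of \mathbb{C} lie on a common circle or line exactly when their cross-ratio is real. Here is a minimal Python sketch of that test; the function names and the tolerance `tol` are my own choices, and this is only the criterion itself, not an attempt at the problem.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (a - c)(b - d) / ((a - d)(b - c)) of four distinct points."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))


def concyclic_or_collinear(a, b, c, d, tol=1e-9):
    """True when the cross-ratio is real, up to the floating-point tolerance."""
    return abs(cross_ratio(a, b, c, d).imag) < tol


# Four points on the unit circle, then three of them plus the centre.
print(concyclic_or_collinear(1, 1j, -1, -1j))
print(concyclic_or_collinear(1, 1j, -1, 0))
```

The criterion does not distinguish circles from lines, so in a real calculation collinearity would have to be excluded separately.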

The point of writing this was that if your initial thought was ‘this looks promising via coordinate methods’, then I guess I agree. But there’s a difference between looking promising, and actually working, and there are lots of parameterisation options. It’s certainly worth thinking very carefully about which to choose, and in this case, challenging though they were, the synthetic or synthetic-trigonometric methods probably were better.


The Combinatorial Nullstellensatz

I’ve been taking a TCC course this term on Additive Combinatorics, delivered via video link from Bristol by Julia Wolf. At some point once the dust of this term has settled, I might write some things about the ideas of the course I’ve found most interesting, in particular the tools of discrete Fourier analysis to get a hold on some useful combinatorial properties of subsets of \mathbb{Z}/n\mathbb{Z} for example.

For this post, I want to talk instead about a topic that was merely mentioned in passing, the Combinatorial Nullstellensatz. The majority of this post is based on Alon’s original paper, which can be found here, and Chapter 9 of Tao and Vu’s book Additive Combinatorics. My aim is to motivate the theorem, give a proof, introduce one useful application from additive combinatorics, and solve Q6 from IMO 2007 as a direct corollary.

What does Nullstellensatz mean? Roughly speaking, it seems to mean ‘a theorem specifying the zeros’. We will be specifying the zeros of a polynomial. We are comfortable with how the zeros of a complex-valued polynomial of one variable behave. The number of zeros is given precisely by the degree of the polynomial (allowing appropriately for multiplicity). It is generally less clear how we might treat the zeros of a polynomial of many variables. The zero set is likely to be some surface, perhaps of dimension one less than the number of variables. In particular, it no longer really makes sense to talk about whether this set is finite or not. The Combinatorial Nullstellensatz gives us some control over the structure of this set of zeros.

The idea behind the generalisation is to view the Fundamental Theorem of Algebra as a statement not about existence of roots, but rather about (combinatorial) existence of non-roots. That is, given a polynomial P(x) of degree n, for any choice of (n+1) distinct complex numbers, at least one of them is not a root of P. This may look like a very weak statement in this context, where we only expect finitely many roots anyway, but in a multivariate setting it is much more intuitively powerful.

Recall that the degree of a monomial is given by the sum of the exponents of the variables present. So the degree of 4x^2 y^3 z is 6. The degree of a polynomial is then given by the largest degree of a monomial in that polynomial. A polynomial P(x_1,\ldots,x_n) over a field F with degree d might have lots of monomial terms of degree d. Suppose one of these monomials, appearing with non-zero coefficient, is x_1^{d_1}\ldots x_n^{d_n}, where \sum d_i=d. Then one version of the Combinatorial Nullstellensatz asserts that whenever you take subsets of the base field S_i\subset F with |S_i|\ge d_i+1, then there is a point with x_i\in S_i such that P(x_1,\ldots,x_n)\neq 0.

In other words, you can’t have a box (ie product of sets) of dimension (d_1+1) \times (d_2+1) \times\ldots\times (d_n+1) on which the polynomial is zero.
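A small brute-force experiment illustrates the box formulation. This Python sketch is my own; the function name, the sample polynomials and the search range are all illustrative assumptions. It works over the integers, viewed inside the field \mathbb{Q}, and searches for a box on which a polynomial vanishes identically.

```python
from itertools import combinations, product


def zero_box(poly, sizes, universe):
    """Return a box S_1 x ... x S_n with |S_i| = sizes[i], drawn from
    universe, on which poly vanishes at every point; None if there is none."""
    choices = [list(combinations(universe, s)) for s in sizes]
    for box in product(*choices):
        if all(poly(*point) == 0 for point in product(*box)):
            return box
    return None


universe = range(-3, 4)

# P(x, y) = xy - 1 has top-degree monomial x^1 y^1, so d_1 = d_2 = 1 and
# no 2 x 2 box should be contained in its zero set.
print(zero_box(lambda x, y: x * y - 1, (2, 2), universe))

# If the size condition is dropped (a 1 x 2 box for xy), a zero box can exist.
print(zero_box(lambda x, y: x * y, (1, 2), universe))
```

The second call shows why the hypothesis |S_i|\ge d_i+1 is needed: with S_1=\{0\} the polynomial xy vanishes on the whole box.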

Unsurprisingly, the proof proceeds by induction on the number of variables. Alon’s result proceeds via a more general theorem giving information about the possibility of writing multivariate polynomials as linear combinations of polynomials in one variable.

We would like to start this induction by fixing the x_n co-ordinate, then viewing P as a polynomial in x_1,\ldots,x_{n-1} only. One problem with this approach is that the largest degree monomials in P are not necessarily still the largest degree monomials in P with x_n fixed. So we need to apply a division algorithm argument.

I’m going to miss some steps so as to keep this of suitable blog post length. The key idea is to apply the division algorithm to P with respect to the simplest polynomial that is zero on all of S_n, which we define as:

g(x_n)=\prod_{s_n\in S_n}(x_n-s_n).

We can decompose as

P(\mathbf{x})=q_n(\mathbf{x})g(x_n)+\sum_{j=0}^{|S_n|-1}r_{n,j}(x_1,\ldots,x_{n-1})x_n^j.

So now we ask where the term x_1^{d_1}\ldots x_n^{d_n} is coming from, bearing in mind that d_n<|S_n|. The lower order terms in g cannot contribute to this, as they cannot be of maximal degree. Also, the first term in q_n(\mathbf{x})g(x_n) cannot contribute as the exponent of x_n is too large. So the term in question must be coming from r_{n,d_n}(x_1,\ldots,x_{n-1})x_n^{d_n}. So now we can apply the induction hypothesis to the polynomial r_{n,d_n} to find x_1\in S_1,\ldots, x_{n-1}\in S_{n-1} such that r_{n,d_n}(x_1,\ldots,x_{n-1}) is non-zero. With these values, we can view the remainder as a non-zero polynomial in x_n of degree less than |S_n|, so it cannot vanish at every point of S_n. Since g vanishes on S_n, P agrees with the remainder there, and so there is an x_n\in S_n such that

P(x_1,\ldots,x_n)=\sum_{j=0}^{|S_n|-1}r_{n,j}(x_1,\ldots,x_{n-1})x_n^j\neq 0.

This concludes the proof by induction.

I want to discuss two relatively simple applications. The first is the Cauchy-Davenport Theorem, which one might view as the first non-trivial theorem in additive combinatorics, giving a bound on the size of a sumset.

Theorem (Cauchy-Davenport): Given A, B non-empty subsets of Z_p for p a prime, then

|A+B|\geq \min\{p,|A|+|B|-1\}.

( A+B:=\{c: c=a+b,a\in A,b\in B\} )

Note that the result isn’t especially surprising. Providing some sort of ordering to the elements of A and B might be a sensible way to proceed. Certainly if they were sets in \mathbb{Z}, this would give a proof immediately.

Proof: Only the case |A|+|B|\leq p is interesting. Following Alon’s argument, suppose that |A+B|\leq |A|+|B|-2, and let C be a set of exactly |A|+|B|-2 elements containing A+B. Set f(x,y)=\prod_{c\in C}(x+y-c), so f has degree |A|+|B|-2 and f(a,b)=0 for all a\in A,b\in B.

Then the coefficient of x^{|A|-1}y^{|B|-1} in f is \binom{|A|+|B|-2}{|A|-1} as we have to choose which of the terms in the product supply an x and which supply a y. This is non-zero (in Z_p recall) since the upper integer is less than p. The Combinatorial Nullstellensatz then gives a contradiction.
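Both the theorem and the key step of the proof can be checked exhaustively for small primes. The Python sketch below is my own illustration with hypothetical function names. It tests the inequality over all pairs of non-empty subsets, and separately confirms that the binomial coefficient used in the proof is non-zero modulo p whenever |A|+|B|\leq p.

```python
from itertools import combinations
from math import comb


def sumset(A, B, m):
    """The sumset A + B inside the integers modulo m."""
    return {(a + b) % m for a in A for b in B}


def cauchy_davenport_holds(m):
    """Check |A+B| >= min(m, |A|+|B|-1) for all non-empty A, B modulo m."""
    subsets = [c for r in range(1, m + 1) for c in combinations(range(m), r)]
    return all(
        len(sumset(A, B, m)) >= min(m, len(A) + len(B) - 1)
        for A in subsets
        for B in subsets
    )


def key_coefficient_nonzero(p):
    """Check binom(|A|+|B|-2, |A|-1) != 0 (mod p) whenever |A|+|B| <= p."""
    return all(
        comb(a + b - 2, a - 1) % p != 0
        for a in range(1, p)
        for b in range(1, p)
        if a + b <= p
    )


print(cauchy_davenport_holds(7), key_coefficient_nonzero(7))
# Primality matters: modulo 4, A = B = {0, 2} gives |A+B| = 2 < 3.
print(cauchy_davenport_holds(4))
```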

My second example is from the IMO in Vietnam which I attended. I spent a lot of time thinking about this problem, but made no progress.

IMO 2007 Question 6: Let n be a positive integer. Consider

S=\{(x,y,z) | x,y,z\in \{0,1,\ldots,n\}, x+y+z>0\}

as a set of (n+1)^3-1 points in 3D space. Determine the smallest number of planes, the union of which contains S but does not include (0,0,0).

Answer: 3n. Consider the planes x+y+z = k for k varying between 1 and 3n. The aim is to prove that you cannot do it with fewer.

To prove this, suppose we can do it with fewer planes, say k<3n. We write the equation of the i-th plane as

a_ix+b_iy+c_iz+d_i=0, \quad i=1,\ldots,k.

Note that the d’s are non-zero as (0,0,0) must not be a solution. Then take the product of all these degree one polynomials together and subtract a multiple of

\prod_{i=1}^n (x-i)(y-i)(z-i),

with the multiple chosen so the resulting polynomial has a root at (0,0,0). (This constant must be non-zero to cancel the non-zero product of the d’s.) Since k<3n, the resulting polynomial has degree 3n by construction, and x^ny^nz^n has a non-zero coefficient, but it is zero on the box \{0,1,\ldots,n\}^3, which contradicts the Combinatorial Nullstellensatz.
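The upper bound of 3n is easy to confirm by machine for small n. In this Python sketch, which is my own illustration, a plane is stored as a tuple (a, b, c, d) standing for ax+by+cz+d=0. That encoding and the function names are assumptions made for the example. It checks the planes x+y+z=k, and also a second natural family of 3n planes, x=i, y=i, z=i for i from 1 to n.

```python
from itertools import product


def covers_S_but_not_origin(planes, n):
    """planes: tuples (a, b, c, d) encoding a*x + b*y + c*z + d = 0."""
    def on_some_plane(point):
        x, y, z = point
        return any(a * x + b * y + c * z + d == 0 for a, b, c, d in planes)

    S = [pt for pt in product(range(n + 1), repeat=3) if sum(pt) > 0]
    return all(on_some_plane(pt) for pt in S) and not on_some_plane((0, 0, 0))


def diagonal_planes(n):
    """The planes x + y + z = k for k = 1..3n."""
    return [(1, 1, 1, -k) for k in range(1, 3 * n + 1)]


def axis_planes(n):
    """The planes x = i, y = i and z = i for i = 1..n."""
    values = range(1, n + 1)
    return (
        [(1, 0, 0, -i) for i in values]
        + [(0, 1, 0, -i) for i in values]
        + [(0, 0, 1, -i) for i in values]
    )


for n in range(1, 5):
    print(n, covers_S_but_not_origin(diagonal_planes(n), n),
          covers_S_but_not_origin(axis_planes(n), n))
```

This only verifies that 3n planes suffice; the lower bound is what the polynomial argument above provides.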

Exponentials kill Polynomials

I gave my second session at the UK IMO training and selection camp at Trinity Cambridge earlier today. This one was to a group of the more experienced students, on the subject of polynomials. This always feels like a tricky topic to present to olympiad students. I always felt that there were lots of useful connections between roots and coefficients, but it was hard to get a handle on exactly what sort of relationship would be useful for each question. Perhaps the main problem is that many of the natural interesting things to talk about lie annoyingly on the fringes of complex analysis or abstract algebra. Or, at any rate, are best explained in that language, which isn’t particularly suitable at this stage when there’s only an hour and a half to play with.

One problem I was particularly keen for the students to attempt was a proof that exponential functions always grow faster than polynomials. I think this is a good problem to think about because it is so useful in all sorts of areas. In probability for example, polynomial decay and exponential decay are the two regimes generally discussed for the tail behaviour of distributions of random variables, and all sorts of things are qualitatively different in the two cases. It is also often a useful step in a proof when we need very crude bounds on a function.

Anyway, how to prove it? Well the first stage is to prove that a polynomial of degree n+1 dominates any polynomial of strictly smaller degree. I am writing ‘dominate’ to mean, ‘is eventually larger than’, under the assumption that the leading coefficients are always positive. (As this seems easier than sticking modulus signs everywhere.)

This isn’t too hard. If we take

P(x)=a_nx^n+a_{n-1}x^{n-1}+\ldots+a_1x+a_0,

then for any x>\max\{1,|a_n|+\ldots+|a_0|\}, we must have x^{n+1}>P(x).
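As a numerical sanity check of this bound, here is a short Python sketch. The function name and the sample coefficients are my own, and I take the threshold to be \max\{1,|a_n|+\ldots+|a_0|\}, where the extra 1 is a safeguard I have added for polynomials whose coefficients are all small. Beyond that threshold, P(x)\le(|a_n|+\ldots+|a_0|)x^n<x^{n+1}.

```python
def dominated_beyond_threshold(coeffs, samples=200):
    """coeffs = [a_0, a_1, ..., a_n]. Check x^(n+1) > P(x) at `samples`
    consecutive integers just above max(1, |a_0| + ... + |a_n|)."""
    n = len(coeffs) - 1
    threshold = max(1, sum(abs(a) for a in coeffs))

    def P(x):
        return sum(a * x**i for i, a in enumerate(coeffs))

    start = int(threshold) + 1  # first integer strictly above the threshold
    return all(x ** (n + 1) > P(x) for x in range(start, start + samples))


print(dominated_beyond_threshold([5, -3, 7]))       # 7x^2 - 3x + 5 against x^3
print(dominated_beyond_threshold([100, 0, 0, 2]))   # 2x^3 + 100 against x^4
```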

Now we introduce the exponential function. In most applications, it turns out to be most natural to use e^x, but for students who haven’t even necessarily done AS-levels I wasn’t happy using a concept whose definition might be rather unfamiliar.

If you are happy with the Taylor series definition

e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\ldots,

then the result is not too challenging. Given a polynomial P of degree k with positive leading coefficient, by the previous result we have that eventually

P(x)<1+x+\frac{x^2}{2!}+\ldots+\frac{x^k}{k!}+\frac{x^{k+1}}{(k+1)!}\leq e^x.

Although the students were able to follow this proof, they were happier thinking about P(n)<2^n. Obviously, we could replace x by n log 2 in the previous argument, but I was pleased with the following direct proof. Ironically, this has much more of the flavour of analysis than the above.

First we can show by induction that n<2^{n/2} for n > 4. It makes sense to take the broader induction hypothesis 4<n<2^{n/2}, and then show that in this range “adding 1 gives you a smaller answer than multiplying by the square root of two”.

From the initial result about polynomials dominating smaller degree polynomials, it suffices to prove the result for P(x)=x^k, for some k, rather than arbitrary polynomials. Now we can prove this by induction on k, the degree of P. We can prove the base case k = 1 via the previous paragraph.

If n^k<2^n eventually, then (4n)^k<2^{n+2k}, so by changing variables, n^k<2^{n/4+2k} eventually, which is in turn less than 2^{n/2} once n>8k. So, eventually

n^{k+1}<2^{n/2}\cdot 2^{n/2}=2^{n}.
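It is instructive to see how long ‘eventually’ takes in practice. This Python sketch, my own illustration with a hypothetical function name, finds the last n at which n^k\ge 2^n. The search limit is an assumption, explained in the docstring.

```python
def last_failure(k):
    """Largest n >= 1 with n^k >= 2^n, or 0 if there is no such n.

    Assumption: searching below 4*k*k + 16 is enough. For n > 1 the
    inequality n^k < 2^n is equivalent to n / log2(n) > k, and n / log2(n)
    is increasing for n >= 3 and already exceeds k at n = 4*k*k.
    """
    limit = 4 * k * k + 16
    failures = [n for n in range(1, limit) if n**k >= 2**n]
    return failures[-1] if failures else 0


for k in range(1, 6):
    print(k, last_failure(k))
```

For k=2 the last failure is n=4, which matches the range n>4 used for the bound n<2^{n/2} above, since that bound is just n^2<2^n.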