# BMO2 2018

The second round of the British Mathematical Olympiad was taken yesterday by the 100 or so top scoring eligible participants from the first round, as well as some open entries. Qualifying for BMO2 is worth celebrating in its own right. The goal of the setters is to find the sweet spot of difficult but stimulating for the eligible participants, which ultimately means it’s likely to be the most challenging exam many of the candidates sit while in high school, at least in mathematics.

I know that lots of students view BMO2 as something actively worth preparing for. As with everything, this is a good attitude in moderation. Part of the reason for writing about the questions at such length (and in particular not just presenting direct solutions) is that I think at this level it’s particularly easy to devote more time than needed to preparation, and to use it poorly.

All these questions could be solved by able children. In fact, each could be solved by able children in less than an hour. You definitely count as an able child if you qualified or if your teacher allowed you to make an open entry! Others count too, naturally. But most candidates won’t in fact solve all the questions, and many won’t solve any. And I think candidates often come up with the wrong reasons why they didn’t solve problems. “I didn’t know the right theorems” is very, very rarely the reason. Olympiad problems have standard themes and recurring tropes, but the task is not to look at the problem and decide that it is an example of Olympiad technique #371. The task is actually to have as many ideas as possible, and eliminate the ones that don’t work as quickly as possible.

The best way to realise that an idea works is to solve the problem immediately. For the majority of occasions when we’re not lucky enough for that to happen, the second-best way to realise that an idea works is to see that it makes the problem look a bit more like something familiar. Conversely, the best way to realise that an idea doesn’t work is to observe that if it worked it would solve a stronger but false problem too. (Eg Fermat’s Last Theorem *does* have solutions over the reals…) The second-best way to realise that an idea doesn’t work is to have the confidence that you’ve tried it enough and you’ve only made the problem harder, or less familiar.

Both of these second-best ideas do require a bit of experience, but I will try to explain why none of the ideas I needed for various solutions this year required any knowledge beyond the school syllabus, some similarities to recent BMOs, and a small bit of creativity.

As usual, the caveat that these are not really solutions, and certainly not official solutions, but they are close enough to spoil the problems for anyone who hasn’t tried them by themselves already. Of course, the copyright for the problems is held by BMOS, and they are reproduced here with permission.

Question One

I wrote this question. Perhaps as a focal point of the renaissance of my interest in geometry, or at least my interest in teaching geometry, I have quite a lot to say about the problem, its solutions, its origin story, the use of directed angles, the non-use of coordinate methods and so on. In an ideal world I would write a book about this sort of thing, but for now, a long and separate post is the answer.

This will be available once I’ve successfully de-flooded my apartment.

Question Two

I also wrote this problem, though I feel it’s only fair to show the version I submitted to the BMO committee. All the credit for the magical statement that appears above lies with them. There is a less magical origin story as well, but hopefully with some interesting combinatorial probability, which is postponed until the end of this post.

One quick observation is that in my version Joe / Hatter gets to keep going forever. As we shall see, all the business happens in the first N steps, but a priori one doesn’t know that, and in my version it forces you to strategise slightly differently for Neel / Alice. In the competition version, we know Alice is done as soon as she visits a place for a second time, but not in the original. So in the original we only have to consider ‘avoid one place’ rather than the multiple possibilities now of ‘avoid one place’ or ‘visit a place again’.

But I think the best idea is to get Alice to avoid one particular place $c\not\equiv 0$ whenever possible. At all times she has two possible options for where to go next, let’s say $b_k+a_k, b_k-a_k$ in the language of the original statement. We lose nothing by assuming $-N/2 < a_k\le N/2$, and certainly it would be ridiculous for Joe / Hatter ever to choose $a_k=0$. The only time Alice’s strategy doesn’t work is when both of these are congruent to $c$, which implies $N\,|\, 2a_k$, and thus we must have $N= 2a_k$. In other words, Alice’s strategy will always work if N is odd.

I think it’s really worth noticing that the previous argument is weak. We certainly did not show that N must be odd for Alice to win. We showed that Alice can avoid a congruence class modulo an odd integer. We didn’t really need that odd integer to be N for this to work. In particular, if N has an odd factor p (say a prime), then the same argument works to show that we can avoid visiting any site with label congruent to 1 modulo p.

It’s actually very slightly more complicated. In the original argument, we didn’t need to use any property of $b_k$. But obviously here, if $b_k\equiv 1$ modulo p and $p\,|\,a_k$, then certainly $b_{k+1}\equiv 1$ modulo p. So we have to prove instead that Alice can ensure she never ‘visits 1 modulo p for the first time’. Which is fine, by the same argument.
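As a sanity check on this avoidance argument, here is a minimal brute-force sketch. It assumes the set-up as used in this post: sites are labelled 0 to N-1, and from $b_k$ Alice moves to $b_k+a_k$ or $b_k-a_k$. The names `safe_moves` and `always_has_safe_move` are made up for illustration, and the modulus `q` may be any divisor of N (odd or not), so that we can see the argument fail in the even case.

```python
def safe_moves(b, a, q, c=1):
    """Return those of b+a, b-a which are NOT congruent to c modulo q."""
    return [x for x in (b + a, b - a) if x % q != c % q]


def always_has_safe_move(n, q, c=1):
    """Check every site b not already congruent to c mod q, and every jump a.

    Assumes q divides n, so that residues mod q make sense on the n-cycle.
    """
    assert n % q == 0
    return all(
        safe_moves(b, a, q, c)          # an empty list means Alice is stuck
        for b in range(n) if b % q != c % q
        for a in range(n)
    )


print(always_has_safe_move(12, 3))   # N=12, working modulo its odd factor 3
print(always_has_safe_move(8, 8))    # N=8, which has no odd factor bigger than 1
```

The first call reports that Alice always has a safe move, and the second reports that she doesn't (for instance from site 5 with jump 4, when both options land on 1).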

So, we’ve shown that Neel / Alice wins if N is odd, or has an odd factor. The only values that remain are powers of 2. I should confess that I was genuinely a little surprised that Joe / Hatter wins in the power of 2 case. You can find a construction fairly easily for N=2 and N=4, but I suspected that might be a facet of small numbers. Why? Because it still felt we could avoid a particular site. In order for Alice’s strategy to fail, we have to end up exactly opposite the particular site at exactly the time when the next $a_k=N/2$, and so maybe we could try to avoid that second site as well, and so on backwards?

But that turned out to be a good example of something that got very complicated quite quickly with little insight. And, as discussed at the beginning, that’s often a sign in a competition problem that your idea isn’t so good. (Obviously, when composing a problem, that’s no guarantee at all. Sometimes things are true but no good ideas work.) So we want other ideas. Note that for N=4, the sequence (2,1,2) works for Joe / Hatter, because that forces Alice / Neel to visit either (0,2,1,3) or (0,2,3,1). In particular, this strategy gave Alice no control on the first step nor the last step, and the consequence is that we force her to visit the evens first, then transfer to an odd, and then force her to visit the other odd.

We might play around with N=8, or we might proceed directly to a general extension. If we have a Joe / Hatter strategy for N, then by doubling all the $a_k$s, we have a strategy for 2N which visits all the even sites in the first N steps. But then we can move to an odd site eg by taking $a_N=1$. Just as in the N=4 case, it doesn’t matter which odd site we start from, since if we again double all the $a_k$s, we will visit all the other odd sites. This gives us an inductive construction of a strategy for powers of two. To check it’s understood, the sequence for N=8 is (4,2,4,1,4,2,4).
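The inductive construction is short enough to check by machine. This is a sketch under the same assumptions as the discussion above (Alice starts at site 0 and chooses a sign for each jump); the function names are mine and not part of the problem.

```python
from itertools import product


def hatter_sequence(n):
    """Jump lengths for N = n, a power of two, via the doubling construction."""
    if n == 1:
        return []
    doubled = [2 * a for a in hatter_sequence(n // 2)]
    # visit the evens, hop to an odd site, then visit the remaining odds
    return doubled + [1] + doubled


def all_tours(n, jumps, start=0):
    """Every tour Alice can make, one for each choice of signs."""
    tours = set()
    for signs in product((1, -1), repeat=len(jumps)):
        pos, tour = start, [start]
        for sign, a in zip(signs, jumps):
            pos = (pos + sign * a) % n
            tour.append(pos)
        tours.add(tuple(tour))
    return tours


def forces_full_visit(n, jumps):
    """True if every possible tour visits all n sites."""
    return all(len(set(tour)) == n for tour in all_tours(n, jumps))


print(hatter_sequence(8))
print(sorted(all_tours(4, [2, 1, 2])))
print(forces_full_visit(8, hatter_sequence(8)))
```

For N=4 the only tours are (0,2,1,3) and (0,2,3,1), matching the discussion above, and the N=8 sequence comes out as (4,2,4,1,4,2,4).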

Although we don’t use it, note that this strategy takes Alice on a tour of sites described by decreasing order of largest power of two dividing the label of the site.

Question Three

I have a theory that the average marks on Q1, Q2 and Q3 on this year’s paper will be in ascending order rather than, as one might expect, descending order. I think my theory will fail because it’s an unavoidable fact of life that in any exam, candidates normally start at the beginning, and don’t move to the middle until making earlier progress. But I think that’s the only reason my theory will fail.

Like kitchen cleanliness or children’s character flaws, it’s hard to compare one’s own problem proposals with others’ rationally. But I felt that, allowing for general levels of geometry non-preference, Q3 was more approachable than Q2, especially to any candidate who’d prepared by looking at some past papers.

I’m in no way a number theorist, but I know three or four common themes when one is asked to prove that a certain sequence contains no squares, or almost no squares. [3a]

• Number theoretic properties of the sequence of squares. Squares cannot be 3 modulo 4 for example. They also cannot be 2 modulo 4, and thus they also cannot be $2^{k-1}$ modulo $2^k$ for any even k. This first observation was essentially the body of most solutions to Q4 of BMO1 2016, among many others.
• Soft properties of the sequence of squares. The sequence of squares grows quadratically. Sometimes we can show a quadratic sequence will have no overlap with some other sequence for basic reasons. This is especially common if the second sequence is also quadratic or similar. For example, the expression $n^2+3n-4$ is typically not a square because

$(n+1)^2 = n^2+2n+1 < n^2 + 3n - 4 < n^2+4n+4 = (n+2)^2,$

when n is large. In fact the right hand inequality is always true, and the left hand inequality is true for $n\ge 6$, which doesn’t leave too many cases to check (and n=5 does actually give a square). This type of argument has been quite common on BMO recently, directly on Q1 of BMO1 2011 and also Q3 of BMO1 2016. An example in a more abstract setting is Q3 of Balkan MO 2007, which I greatly enjoyed at the time…
• Number theoretic properties of the definition of a square. A square is the product of an integer with itself, and so if we want the product of two or more integers to be a square, then this imposes conditions on the shared factors of the two integers. I’ll cite some examples shortly.
• Huge theorems. Some old paper which I encountered as a child asked us to find all solutions to $x^2-1=2^y$. Or similar – I can’t find it now – but Q2 of BMO2 2006 is close enough to the sensible approach to the problem. I think it’s more helpful to think about this as proving that a particular sequence rarely includes powers of two than that a particular sequence rarely includes squares. But either way, one could in principle use the Catalan conjecture, which controls all non-trivial solutions to $a^p - b^q=1$. Fortunately, the Catalan conjecture was proved, by Mihailescu (readable blog about it), between the paper being set, and me attempting it a few years later. I’m being flippant. This is not a standard trope in solving these questions. For very obvious reasons. If it can be killed by direct reference to a known theorem, it won’t be set.
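The first two themes are easy to play with numerically. The following is only an illustrative sketch (the helper names are mine): it lists the residues that squares can take for a given modulus, and searches for the values of n for which $n^2+3n-4$ is a square.

```python
from math import isqrt


def is_square(x):
    """True if x is the square of a non-negative integer."""
    return x >= 0 and isqrt(x) ** 2 == x


def square_residues(modulus):
    """The residues taken by squares for this modulus."""
    return sorted({(r * r) % modulus for r in range(modulus)})


squares_in_sequence = [n for n in range(1, 10_000) if is_square(n * n + 3 * n - 4)]

print(square_residues(4))
print(square_residues(16))
print(squares_in_sequence)
```

Squares are only 0 or 1 modulo 4, and only 0, 1, 4 or 9 modulo 16. The search finds just n=1 (which gives 0) and n=5 (which gives 36), consistent with the sandwich taking over from n=6 onwards.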

Anyway, those references (and more to follow) are to illuminate why I thought this question was not too hard. Indeed, I feel one can make substantial meta-progress in one’s head. The given information is interesting, but for the purpose of this question is just a black box. By subtracting the expression for m from the expression for 2m, we can derive an expression for the required sum. It’ll be a quartic in m, because the leading terms won’t cancel.

This leaves all three of the methods above very accessible. Unfortunately m=0 would be a square were it not excluded specifically, so a modular arithmetic approach is unlikely to work directly. Bounding between two quadratics is entirely plausible, as is factorising and comparing number theoretic properties of the factors. I thought the second one seemed more promising, but either way, having two potentially good ideas based only on recent BMO problems before even writing anything down is a good opening.

We do have to calculate the sum, and I make it $\frac{1}{4}m^2(5m+3)(3m+1)$. Now I’m not so sure how to bound this between two quadratics, because the leading coefficient is 15/4, which is not the square of a rational. But the factor analysis approach is definitely on.
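For anyone who wants to check the calculation, here is a sketch of it. I'm assuming that the given information is the standard closed form $1^3+2^3+\ldots+n^3=\frac14 n^2(n+1)^2$, and that the required sum is $(m+1)^3+\ldots+(2m)^3$; neither is reproduced in this post, but both fit the description above. Subtracting the expression for m from the expression for 2m gives

$\sum_{k=m+1}^{2m} k^3 = \frac{(2m)^2(2m+1)^2}{4} - \frac{m^2(m+1)^2}{4} = \frac{m^2}{4}\left[4(2m+1)^2-(m+1)^2\right],$

and the bracket is a difference of two squares:

$4(2m+1)^2-(m+1)^2 = (4m+2-m-1)(4m+2+m+1) = (3m+1)(5m+3).$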

Let’s review this generally. Throughout, suppose m,n are positive integers.

Claim 1: if mn is a square, then m and n are squares too.

Claim 2: if mn is a square, then m=n.

Both of these claims are false. However, a version of Claim 1 is true.

Claim 1′: if mn is a square, and m,n are coprime, then each is a square.

Even though this isn’t a named theorem, it is true, and well-known and can be used without proof. One way to prove it is to write m,n as products of primes, and show that since the primes are disjoint, the exponents must all be even. Most other methods will be equivalent to this, maybe with less notation.

What is good about Claim 1′ is that more complicated versions are true for essentially similar reasons. For example

Claim 3: if mn is $6k^2$, and m,n are coprime, then either one is a square and the other is six times a square; or one is two times a square, and the other is three times a square.

Claim 4: if mn is a square, and the greatest common divisor (m,n) is either 5 or 1, then either each is a square, or each is five times a square.
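None of these claims needs a computer, but a brute-force check is a cheap way to convince yourself that you have stated them correctly. This is a sketch only, with helper names of my own choosing, and it tests the claims just for small m and n.

```python
from math import gcd, isqrt


def is_square(x):
    return x >= 0 and isqrt(x) ** 2 == x


def is_c_times_square(x, c):
    """True if x equals c times a perfect square."""
    return x % c == 0 and is_square(x // c)


def check_claims(limit):
    """Brute-force Claims 1', 3 and 4 for 1 <= m, n < limit."""
    shapes = {(1, 6), (6, 1), (2, 3), (3, 2)}   # allowed multipliers in Claim 3
    for m in range(1, limit):
        for n in range(1, limit):
            g, p = gcd(m, n), m * n
            both_sq = is_square(m) and is_square(n)
            if g == 1 and is_square(p) and not both_sq:
                return False                     # Claim 1' would fail here
            if g == 1 and is_c_times_square(p, 6):
                if not any(is_c_times_square(m, s) and is_c_times_square(n, t)
                           for s, t in shapes):
                    return False                 # Claim 3 would fail here
            if g in (1, 5) and is_square(p):
                both_5sq = is_c_times_square(m, 5) and is_c_times_square(n, 5)
                if not (both_sq or both_5sq):
                    return False                 # Claim 4 would fail here
    return True


print(check_claims(150))
# m=2, n=8 is a counterexample to both Claim 1 and Claim 2
print(is_square(2 * 8), is_square(2), is_square(8))
```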

I cited some examples of the other methods I proposed. Here are some examples of this sort of thing in recent BMOs:

• Q4 of BMO2 2016. Even the statement is suggestive. There are more complicated routes, but showing that $(2p-u-v)(2p+u+v)$ is a square is one way to proceed, and then Claim 4 directly applies after checking a gcd.
• Q2 of BMO1 2014 is similar, but it is much more explicit that this is the correct approach. Expose $p^2$ then use a (correct) version of Claim 2.
• Q1 of BMO2 2009. Show that a and b must each be a square times 41 for rationality reasons.
• Q6 of BMO1 2006. After sensible focused substitutions, obtain $3n^2=q(q-1)$. Rather than try to ‘solve’ this, extract the key properties along the lines of Claim 3, eliminate one of the cases by modular arithmetic, and return to the required statement.
• Q3 of BMO2 2010 requires the student to reproduce the essentials of the arguments above in the case of a particular degree six polynomial with a tractable factorisation, along with some mild square-sandwiching or bounding arguments as discussed earlier.

In conclusion, I’m trying to say that if I claim I am confident I can find all integers m such that $\frac14 m^2(5m+3)(3m+1)$ is a square, this is not based on complicated adult experience, but rather on recent problems at a similar sensible level. And I still don’t think it counts as Olympiad technique #371 – thinking about divisibility of factors is a good thing to do when talking about integers, and so it’s just a natural entry point into problems about squares. Plenty of problems might have this sort of thing as a starting point or an ending point.

For this problem we need a different ending point. To be brief, the factors (5m+3) and (3m+1) cannot both be squares because 5m+3 is 3 modulo 5, so is never a square. Since 3(5m+3)-5(3m+1)=4, the gcd of these factors is 1, 2 or 4, so the only other option is that they are both squares times 2. But 3m+1 is 1 modulo 3, and because -1 is not a square modulo 3, 1 is not a (square times 2) modulo 3, and we are done. Note that this was a literal example of the first technique for proving something is not a square, proposed all the way back at the start of this section.
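Each ingredient of this ending can be checked numerically for small m. The sketch below assumes the sum in question is $(m+1)^3+\ldots+(2m)^3$, which is what the factorised expression corresponds to; the function names are mine.

```python
from math import gcd, isqrt


def is_square(x):
    return x >= 0 and isqrt(x) ** 2 == x


def cube_sum(m):
    """(m+1)^3 + ... + (2m)^3, computed directly."""
    return sum(k ** 3 for k in range(m + 1, 2 * m + 1))


def checks_pass(limit):
    """Check the closed form, the gcd, the residue and non-squareness for m < limit."""
    for m in range(1, limit):
        a, b = 5 * m + 3, 3 * m + 1
        total = cube_sum(m)
        if 4 * total != m * m * a * b:
            return False          # closed form is wrong
        if gcd(a, b) not in (1, 2, 4):
            return False          # gcd should divide 4
        if is_square(a):
            return False          # 5m+3 is 3 mod 5, so never a square
        if is_square(total):
            return False          # the statement we are proving
    return True


print(checks_pass(1000))
```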

Footnotes

[3a] – some common themes for proving that sequences do include squares might be comparison with Pell’s Equations, or comparison with the explicit construction of solutions to Pythagoras’s equation.

Question Four

An example of an absorbing function is $f(x)=\lfloor x\rfloor$. One challenge is thinking of many other examples. This one is fine, but it’s true under replacing 2018 by 1 in the statement, and so it doesn’t really capture the richness of the situation.

Notation: the pre-image of a function is the language used to describe the inverse of a function which doesn’t have a uniquely-defined inverse. That is, we use it when f is not injective, so that multiple arguments have the same output. We write $f^{-1}(y)=\{x: f(x)=y\}$. In particular, this is a set of values, not necessarily a single value. We also use $\mathbb{Z}$ to denote the integers. We can apply pre-images to sets as well. So for example $f^{-1}(\mathbb{Z})=\{x : f(x)\in \mathbb{Z}\}$.

This question is tricky, and I will be surprised to see many full solutions from the eligible candidates. It rewards the sort of organisation and clear-thinking that is easier said than done in a time-pressured contest environment. There are also many many possible things to consider, and so is particularly challenging in the short timeframe of BMO2 as opposed to, for example, appearing as the middle question on a 4.5 hour international-level paper.

At a meta-level we are being asked to confirm or deny the existence of absorbing functions where $f^{-1}(\mathbb{Z})$ is small in some sense, firstly when actually having finite size, secondly when, although infinite, being a small sort of infinite, namely spread out in a sparse, well-ordered way (you might say countable if familiar with that language). The general idea is presumably that it’s hard to be absorbing if the pre-image of the integers is small, and so it’s reasonable to assume that it’s too hard if this is finite; but perhaps not quite too hard if it’s merely countable. So (no, yes) is a sensible guess at the answer to the question, though (no, no) might also fit, maybe with a harder argument for the second no.

Ok, instead of trying a) or b), just play with the configuration. Let $A=f^{-1}(\mathbb{Z})$. We will use this frequently. In the picture below, f maps the real line on top to the real line below. If two reals get mapped to the same image, then whether or not the image is an integer, the whole (closed) interval bounded by the two reals also gets mapped to the same image. This is because f is weakly increasing.

This means that A consists of various intervals (which include single points). But in both a) and b) we know that A is ‘small’, and so it cannot contain any intervals of positive length. So in fact A is a set of separated real values. In the case of a) it’s a finite set.

Do we want to try and iterate this, and look at $f^{-1}(A)$? Well maybe, but we don’t know much about pre-images of A, only about pre-images of $\mathbb{Z}$.

But note that the pre-image of the pre-image of the … of the pre-image [2017 times] of A must be the whole real line, so at some point, some value has a pre-image that is an interval. So if we’re guessing that the answer to b) is yes, then we need to give a construction.

$\mathbb{R} \stackrel{f}\longrightarrow ?? \stackrel{f}\longrightarrow\quad\ldots\quad \stackrel{f}\longrightarrow ??\stackrel{f}\longrightarrow A \stackrel{f}\longrightarrow f(A)\subset \mathbb{Z}.$

If you play around for a bit, it seems very unlikely to be absorbing if the integers don’t get mapped to the integers. You can try to prove this, but at the moment we’re just aiming for a construction, so let’s assume $f(\mathbb{Z})\subset \mathbb{Z}$. It would be convenient if f(n)=n for all $n\in \mathbb{Z}$, but we already know that this won’t work because then the pre-image of the pre-image of the… of $\mathbb{Z}$ is always $\mathbb{Z}$, but we need it to be $\mathbb{R}$.

The ideal situation would be if $A= \mathbb{Z}\cup \{\ldots, a'_{-1},a'_0,a'_1,\ldots\}$, where the pre-image of $\{\ldots, a'_{-1},a'_0,a'_1,\ldots\}$ is pretty much everything.

Informally, we are specifically banned from mapping intervals directly onto an integer. So have an intermediate set, and try to map almost everything (except the integers and the set itself) onto that set, and then map that set into the integers.

At this point, you really just have to have the right idea and finish it. Many things will work, but this seems the easiest to me. Let the set A consist of the integers and the (integers plus 1/2). And for $x\in A$, f(x)=2x. This is what f looks like so far.

Here the black crosses are integers, and the purple crosses are (integers plus 1/2). But now we need to make as many reals as possible in the top row map to a purple cross (which is allowed, because purple crosses aren’t integers), but we need also to preserve the weakly increasing property. Fortunately, we can exactly do that. Each cross of either colour in the top row maps to a black cross in the middle row (ie an integer), so we can map the open interval between crosses in the top row to a purple cross in the middle row. As shown in red:

Note that this is consistent. The fact that I haven’t drawn in the red cones into the bottom row is only because I didn’t use the bottom row to motivate doing this. I’ve shown a consistent definition of f that maps all the reals onto the integers in two steps. If it’s an integer to begin with, that was great; if it was an (integer plus 1/2) to begin with then it becomes an integer in one step and stays an integer; and otherwise it first maps to an (integer plus 1/2), and then to an integer in the second step.

To check you’ve understood, try to write down a standalone definition of this function.
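If you’d like something to compare against after trying this, here is one way it could be written, as a sketch rather than the definitive answer (so skip ahead if you don’t want it spoiled). It uses exact rational arithmetic so that ‘is an integer plus 1/2’ can be tested reliably; the name `f` is from the problem, everything else is my choice.

```python
from fractions import Fraction
from math import floor


def f(x):
    """Double the integers and the (integers plus 1/2); send each open
    interval between neighbouring crosses to the (integer plus 1/2) in between."""
    x = Fraction(x)
    if (2 * x).denominator == 1:          # x is an integer or an integer plus 1/2
        return 2 * x
    return floor(2 * x) + Fraction(1, 2)


for x in (Fraction(1, 3), Fraction(1, 2), Fraction(-7, 5), Fraction(4)):
    print(x, f(x), f(f(x)))
```

On rationals, at least, you can confirm that this is weakly increasing, that f(x) is an integer only when x is an integer or an (integer plus 1/2), and that two applications always give an integer.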

I’ve therefore solved part b) with the alternative condition $\ldots<a_{-1}<a_0<a_1<\ldots$, which isn’t exactly as required. It requires one small and simple idea to convert to a solution to the actual statement. See if you can find it yourself!

I think part a) is harder, not because the solution will look more complicated, but because there are so many potential partial results you could try to prove, because there are so many sets you could consider. To name a few: the image of f, the image of f intersected with $\mathbb{Z}$, the image of $\mathbb{Z}$, the 2018-composition image $f^{2018}(\mathbb{R})$, the 2018-composition image $f^{2018}(\mathbb{Z})$ and so on and so forth. You might have good insight into the wrong things.

For me, the crucial observation (which you can see from the figure in the b) construction) is that when composing an increasing function with itself, the ‘trajectories’ are either increasing or decreasing. That is, if $x\le f(x)$ (respectively, $x\ge f(x)$), then $x\le f(x)\le f^2(x)\le f^3(x)\le\ldots$ (respectively $x\ge f(x)\ge f^2(x)\ge \ldots$). Again, you can think of this as Olympiad technique #371 if you insist, but I don’t think that’s helpful. There are lots of things one could try to say here, and this turns out to be natural, true and useful, but you can’t know it’s useful until you play with it.
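In case the monotonicity of trajectories isn’t clear, the whole justification is to apply f repeatedly to both sides of the inequality, using only that f is weakly increasing:

$x\le f(x)\ \Rightarrow\ f(x)\le f(f(x))=f^2(x)\ \Rightarrow\ f^2(x)\le f^3(x)\ \Rightarrow\ \ldots$

The decreasing case is identical with the inequalities reversed.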

Anyway, we’re playing with part a), and we know that $f^k(x)$ is an integer for all large enough k, and that $f^{k+1}(x)$ is also an integer, so $f^k(x)$ is one of a finite set of integers because of the condition on A. But we’ve seen the sequence $x,f(x),f^2(x),\ldots$ is weakly increasing or weakly decreasing, and so if we also know it’s eventually bounded (because eventually it’s in this finite set) then it must eventually be constant. And this constant is one of the integers, say n. But unless we started from n, this means that f(n)=n, but also f(x)=n for some other real value x. And so exactly as at the very very beginning, that’s bad, because then the whole interval [x,n] gets mapped to n, which is a contradiction.

Question Two – Origin story

The origin story for Q2 started in a talk I heard by Renan Gross at Weizmann, who referenced some of the history of Scenery Reconstruction. Roughly speaking, we colour the integers (say with two colours), and then let loose a random walker, who tells us the sequence of colours she observes during her walk, but no other information about the walk itself.

How much information can we recover about the colouring? Obviously, the best we can hope for is to recover the colouring, up to translations and reflection, since for every possible random walk trajectory, the exact reflection is equally probable, and we are given no information about the starting point.

Since lots of the transitions between recoverable and unrecoverable depend on the periodicity of the colouring, a reasonable toy model is to do it on a cycle. Note that the Strong Law of Large Numbers tells us that we almost surely recover the number of black sites and white sites from the infinite trajectory of the random walk. Of course it’s possible that there are only two black vertices, and they are adjacent, and the walker oscillates between them, thus seeing BBBBBB… But this is extremely unlikely. You could think of this in Bayesian terms as strongly increasing the prior on the whole cycle being black, but I think initially it’s best to do this as an infinite-time, SLLN problem not as finite time WLLN/CLT reweightings of anything.

But what more? It’s clear that the lengths of all black substrings should follow some mixed geometric-ish distribution, and this distribution will almost surely wash out as the empirical distribution in an SLLN sense. But it’s tricky to justify why such a mixed geometric-ish distribution should be uniquely determined by the lengths of black arcs in the cycle. But it does definitely feel like we should have enough information to reconstruct the colouring up to reflection/rotation with probability one. For example, analogously to the number of black vertices and the number of white vertices, we should be able to recover the number of adjacent black vertices, the number of adjacent white vertices, and the number of black-white adjacent vertices, and so on.

Anyway, this can be done, and it follows as a consequence of various authors’ work answering some more general conjectures of Benjamini and, separately, of den Hollander and Keane. Douglas Howard [DH] shows a handful of generalisations of this, as do Benjamini and Kesten [BK]. Most of this work is focused on sceneries on $\mathbb{Z}$, but periodic sceneries are often used as a basis, and of course, the only difference between periodic sceneries on $\mathbb{Z}$ and sceneries on the N-cycle is whether you know the period in advance. [BK] show that ‘almost all’ sceneries are distinguishable in a particular sense, in response to which Lindenstrauss [L99] exhibits a large family of sceneries which are not distinguishable. A readable but technical review is [ML].

So Renan’s talk was about the similar problem (and generalisations) on the hypercube [GG]. Rather than paraphrase the main differences badly, you can read his own excellent blog post about the work.

On the train back to Haifa from Rehovot, I was thinking a bit about the cycle case, and what happens if you generalise the random walk with varying jump lengths, or indeed introduce a demon walker, whose goal is to make it as hard as possible for the reviewer to deduce the colouring. One way this can certainly happen is if the walker can avoid visiting some particular site, as then how could one possibly deduce the colour of the never-visited site? And so we get to the statement posed.

References

[BK] – Benjamini, Kesten, 1996 – Distinguishing sceneries by observing the scenery along a random walk path

[dH] – den Hollander, 1988 – Mixing properties for random walk in random scenery

[DH] – Douglas Howard, 1996 – Detecting defects in periodic scenery by random walks on Z

[GG] – Grupel, Gross, 2017 – Indistinguishable sceneries on the Boolean hypercube

[L99] – Lindenstrauss, 1999 – Indistinguishable sceneries

[ML] – Matzinger, Lember, 2003 – Scenery reconstruction: an overview

# Balkan MO 2017 – Qs 1, 3 and 4

The UK is normally invited to participate as a guest team at the Balkan Mathematical Olympiad, an annual competition between eleven countries from South-Eastern Europe. I got to take part in Rhodes almost exactly ten years ago, and this year the competition was held in Ohrid, in Macedonia. There’s one paper, comprising four questions, normally one from each of the agreed olympiad topic areas, with 4.5 hours for students to address them. The contest was sat this morning, and I’m going to say quite a bit about the geometric Q2, and a little bit about Qs 1 and 3 also. In all cases, this discussion will include most of a solution, with some commentary, so don’t read these if you are planning to try the problems yourself.

I’m not saying anything about Q4, because I haven’t solved it. (Edit: I have solved it now, so will postpone Q2 until later today.)

Question One

Find all ordered pairs of positive integers (x,y) such that

$x^3+y^3=x^2+42xy+y^2.$

The first thought is that if either of x or y is ‘large’, then the LHS is bigger than the RHS, and so equality can’t hold. That is, there are only finitely many solutions. The smallest possible value of y is, naturally, 1, and substituting y=1 is convenient as then $y^2=y^3$, and it’s straightforward to derive $x=7$ as a solution.

Regarding the non-existence of large solutions, you can make this precise by factorising the LHS as

$(x+y)(x^2-xy+y^2) = x^2+42xy+y^2.$

There are 44 terms of degree two on the RHS, and (net) one term of degree two in the second bracket on the LHS. With a bit of AM-GM, you can see then that if $x+y>44$, you get a contradiction, as the LHS will be greater than the RHS. But that’s still a lot of possibilities to check.

It struck me that I could find ways to reduce the burden by reducing modulo various primes. 2, 3 and 7 all divide 42, and furthermore cubes are nice modulo 7 and squares are nice modulo 3, so maybe that would bring the number of possibilities down. But my instinct was that this wasn’t the right way to use the fact that we were solving over positive integers.

The second bracket in the factorisation looks enough like the RHS, that it’s worth exploring. If we move $x^2-xy+y^2$ from the right to the left, we get

$(x+y-1)(x^2-xy+y^2) = 43xy.$ (1.1)

Now it suddenly does look useful that we are solving over positive integers, because 43 is a prime, so has to appear as a factor somewhere on the LHS. But it’s generally quite restrictive that $x^2-xy+y^2 | 43xy$. This definitely looks like something that won’t hold often. If x and y are coprime, then certainly $x^2-xy+y^2$ and $y$ are coprime also. But actually if x and y have a non-trivial common factor d, we can divide both sides by $d^2$, and it still holds. Let’s write

$x=dm,\quad y=dn,\quad\text{where }d=\mathrm{gcd}(x,y).$

Then $m^2 -mn+n^2$ really does divide 43, since it is coprime to both m and n. This is now very restrictive indeed, since it requires that $m^2-mn+n^2$ be equal to 1 or 43. A square-sandwiching argument gives $m^2-mn+n^2=1$ iff $m=n=1$. 43 requires a little bit more work, with (at least as I did it) a few cases to check by hand, and gives $\{m,n\}=\{7,1\}$ or $\{7,6\}$.

We now need to add the common divisor d back into the mix. In the first case, (1.1) reduces to $(2d-1)=43$, which gives $(x,y)=(22,22)$. In the second case, after cancelling a couple of factors, (1.1) reduces to $d(m+n)-1=mn$. For $\{7,1\}$ this is $(8d-1)=7$, from which $(x,y)=(7,1),(1,7)$ emerges; for $\{7,6\}$ it is $(13d-1)=42$, which has no integer solution. So these must be all the solutions.
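Since we showed earlier that any solution has $x+y\le 44$, the answer can be confirmed by a direct search. This is just a sketch to double-check the casework (function names mine). It also lists the coprime pairs with $m^2-mn+n^2$ equal to 1 or 43, which is where it is easy to miss a case: (7,6) has this property too, but it drops out because $13d-1=42$ has no integer solution.

```python
from math import gcd


def balkan_solutions(bound=44):
    """All positive (x, y) with x, y <= bound solving x^3+y^3 = x^2+42xy+y^2."""
    return sorted(
        (x, y)
        for x in range(1, bound + 1)
        for y in range(1, bound + 1)
        if x ** 3 + y ** 3 == x * x + 42 * x * y + y * y
    )


def coprime_pairs_with_value(value, bound=10):
    """Coprime positive (m, n) with m, n <= bound and m^2 - mn + n^2 == value."""
    return sorted(
        (m, n)
        for m in range(1, bound + 1)
        for n in range(1, bound + 1)
        if gcd(m, n) == 1 and m * m - m * n + n * n == value
    )


print(balkan_solutions())
print(coprime_pairs_with_value(1))
print(coprime_pairs_with_value(43))
```

The bound of 10 in the second function is enough because $m^2-mn+n^2\ge \frac34\max(m,n)^2$, so neither m nor n can exceed 7 when the value is 43.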

The moral here seemed to be that divisibility was a stronger tool than case-reduction. But that was just this question. There are other examples where case-reduction is probably more useful than chasing divisibility.

Question Three

Find all functions $f:\mathbb{N}\rightarrow\mathbb{N}$ such that

$n+f(m) \,\big|\, f(n)+nf(m)$

for all $m,n\in\mathbb{N}$.

What would be useful here? There are two variables, and a function. It would be useful if we could reduce the number of variables, or the number of occurrences of f. We can reduce the number of variables by taking m=n, to get

$n+f(n) \,\big|\, f(n) [1+n].$ (3.1)

From this, we might observe that $f(n)\equiv 1$ is a solution. Of course we could analyse this much more, but this doesn’t look like a 10/10 insight, so I tried other things first.

In general, the statement that $a\,|\,b$ also tells us that $a\,|\, b-ka$. That is, we can subtract arbitrary multiples of the divisor, and the result is still true. A recurring trope is that the original b is elegant, but an adjusted b-ka is useful. I don’t think we can do the latter, but by subtracting $n^2 +nf(m)$ from the problem statement, we get

$n+f(m) \,\big|\, n^2-f(n).$ (3.2)
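Spelled out, the step to (3.2) is a one-line computation. It uses nothing beyond the problem statement and the multiple $n\,(n+f(m))$ of the divisor:

```latex
\begin{align*}
\bigl(f(n)+nf(m)\bigr) - n\bigl(n+f(m)\bigr) &= f(n)-n^2,\\
\text{so}\qquad n+f(m)\,\big|\,f(n)+nf(m) \quad&\Longleftrightarrow\quad n+f(m)\,\big|\,n^2-f(n).
\end{align*}
```

The sign of the right-hand side does not matter for divisibility, which is why (3.2) can be written with $n^2-f(n)$.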

There’s now no m on the RHS, but this relation has to hold for all m. One option is that $f(n)=n^2$ everywhere, then what we’ve deduced always holds since the RHS is zero. But if there’s a value of n for which $f(n)\ne n^2$, then (3.2) is a very useful statement. From now on, we assume this. Because then as we fix n and vary m, we need $n+f(m)$ to remain a divisor of the RHS, which is fixed, and so has finitely many divisors. So $f(m)$ takes only finitely many values, and in particular is bounded.

This ties to the observation that $f\equiv 1$ is a solution, which we made around (3.1), so let’s revisit that: (Note, there might be more elegant ways to finish from here, but this is what I did. Also note, n is no longer fixed as in previous paragraph.)

$n+f(n) \,\big|\, f(n) [1+n].$ (3.1)

Just to avoid confusion between the function itself, and one of the finite collection of values it might take, let’s say b is a value taken by f. So there are values of n for which

$n+b \,\big|\, b(1+n).$

By thinking about linear equations, you might be able to convince yourself that there are only finitely many solutions (in n) to this relation. There are certainly only finitely many solutions where LHS=RHS (well, at most one solution), and only finitely many where $2\times$LHS=RHS, etc. But why do something complicated, when we can actually repeat the trick from the beginning, and subtract $b(n+b)$, to obtain

$n+b \,\big|\, b^2-b.$

For similar reasons to before, this is a great deduction, because it means if $b\ne 1$, then the RHS is positive, which means only finitely many n can satisfy this relation. Remember we’re trying to show that no n can satisfy this relation if $b\ne 1$, so this is definitely massive progress!

If any of what’s already happened looked like magic, I hope we can buy into the idea that subtracting multiples of the divisor from the RHS is the only tool we used, and that making the RHS fixed gives a lot of information about the LHS as the free variable varies. The final step is not magic either. We know that f is eventually 1 or, if you prefer, “for large enough n, $f(n)=1$”, since all other values appear only finitely often. I could write this with quantifiers, but I don’t want to, because that makes it seem more complicated than it is. We genuinely don’t care when the last non-1 value appears.

Anyway, since we’ve deduced this, we absolutely have to substitute this into something we already have. Why not the original problem statement? Fix m, then for all large enough n

$n+f(m) \,\big|\, 1+nf(m).$ (3.3)

To emphasise, (3.3) has to hold for all large enough n. Is it possible that f(m)=2? Again, it’s easy to convince yourself not. But, yet again, why not use the approach we’ve used so profitably before to clear the RHS? In fact, we already did this, and called it (3.2), and we can make that work [3.4], but in this setting, because f(m) is fixed and we’re working with variable large n, it’s better to eliminate n, to get

$n+f(m)\,\big|\, f(m)^2-1,$

again for all large enough n. By the same size argument as before, this is totally impossible unless f(m)=1. Which means that in fact $f(m)=1$ for all m. Remember ages ago we assumed that f(n) was not $n^2$ everywhere, so this gives our two solutions: $f(n)=1,\, f(n)=n^2$.
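A quick numerical check of the conclusion costs very little. The sketch below tests the divisibility condition on a finite range only, so it can refute a candidate but cannot prove one. The helper name `satisfies`, the cutoff `limit` and the list of candidates are all illustrative choices, not part of the argument.

```python
def satisfies(f, limit=60):
    """True if n + f(m) divides f(n) + n*f(m) for all 1 <= m, n <= limit."""
    return all(
        (f(n) + n * f(m)) % (n + f(m)) == 0
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )

# A few natural guesses; only the first two should survive.
candidates = {
    "f(n) = 1": lambda n: 1,
    "f(n) = n^2": lambda n: n * n,
    "f(n) = n": lambda n: n,
    "f(n) = 2": lambda n: 2,
    "f(n) = n^3": lambda n: n ** 3,
}
results = {name: satisfies(f) for name, f in candidates.items()}
for name, ok in results.items():
    print(name, ok)
```

For instance, $f(n)=n$ already fails at $m=1, n=2$, where the condition would need 3 to divide 4.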

Moral: choosing carefully which expression to work with can make life much more interesting later. Eliminating as many variables or difficult things from one side is a good choice. Playing with small values can help you understand the problem, but here you need to think about soft properties of the expression, in particular what happens when you take one variable large while holding another fixed.

[3.4] – if you do use the original approach, you get $n^2-1$ on the RHS. There’s then the temptation to kill the divisibility by taking n to be the integer in the middle of a large twin prime pair. Unfortunately, the existence of arbitrarily large such n is still just a conjecture.

Question Four

(Statement copied from Art of Problem Solving. I’m unsure whether this is the exact wording given to the students in the contest.)

On a circular table sit n>2 students. First, each student has just one candy. At each step, each student chooses one of the following actions:

(A) Gives a candy to the student sitting on his left or to the student sitting on his right.

(B) Separates all its candies in two, possibly empty, sets and gives one set to the student sitting on his left and the other to the student sitting on his right.

At each step, students perform the actions they have chosen at the same time. A distribution of candy is called legitimate if it can occur after a finite number of steps.
Find the number of legitimate distributions.

My moral for this question is this: I’m glad I thought about this on the bus first. What I found hardest here was getting the right answer. My initial thoughts:

• Do I know how to calculate the total number of possibilities, irrespective of the algorithm? Fortunately yes I do. Marbles-in-urns = barriers between marbles on a line (maybe add one extra marble per urn first). [4.1]
• What happens if you just use technique (A)? Well first you can get into trouble because what happens if you have zero sweets? But fine, let’s temporarily say you can have a negative number of sweets. If n is even, then there’s a clear parity situation developing, as if you colour the children red and blue alternately, at every stage you have n/2 sweets moving from red children to blue and vice versa, so actually the total number of sweets among the red children is constant through the process.
• What happens if you just use technique (B)? This felt much more promising.
• Can you get all the sweets to one child? I considered looking at the child directly opposite (or almost-directly opposite) and ‘sweeping’ all the sweets away from them. It felt like this would work, except if for some parity reason we couldn’t prevent the final child having one (or more, but probably exactly one) sweets at the crucial moment when all the other sweets got passed to him.

Then I got home, and with some paper, I felt I could do all possibilities with n=5, and all but a few when n=6. My conjecture was that all are possible with n odd, and all are possible with n even, except those where none of the red kids or none of the blue kids get a sweet. I tried n=8, and there were a few more that I couldn’t construct, but this felt like my failure to be a computer rather than a big problem. Again there’s a trade-off between confirming your answer, and trying to prove it.
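For small n, a computer can do the checking that paper struggles with. The sketch below is a breadth-first search over distributions and is only a way of testing the conjecture, not a proof. It assumes that action (A) is available only to a student holding at least one candy, and that a student with none uses (B) with two empty sets; the statement leaves this open. All names are illustrative.

```python
from itertools import product

def outflows(c):
    """Possible (to_left, to_right) gifts for a student holding c candies."""
    options = {(j, c - j) for j in range(c + 1)}   # action (B): give everything away
    if c >= 1:
        options |= {(1, 0), (0, 1)}                # action (A): pass a single candy
    return options

def legitimate(n):
    """Distributions reachable from (1, ..., 1) in at least one step."""
    seen, frontier = set(), [(1,) * n]
    while frontier:
        next_frontier = []
        for state in frontier:
            for choice in product(*(outflows(c) for c in state)):
                new = [c - left - right for c, (left, right) in zip(state, choice)]
                for i, (left, right) in enumerate(choice):
                    new[(i - 1) % n] += left
                    new[(i + 1) % n] += right
                new = tuple(new)
                if new not in seen:
                    seen.add(new)
                    next_frontier.append(new)
        frontier = next_frontier
    return seen

def one_colour_only(state):
    """For even n: True if all candies sit on alternate seats (one colour)."""
    return sum(state[0::2]) == 0 or sum(state[1::2]) == 0

reach = legitimate(4)
print(len(reach), any(one_colour_only(s) for s in reach))
```

For n=4 this finds 25 of the 35 possible distributions, the missing 10 being exactly those with all the candy on one colour, and for n=3 all 10 distributions occur. The search grows quickly with n, so it is only practical for small tables.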

Claim: If n is even, you can’t achieve the configurations where either the red children or the blue children have no sweets.

Proof: Suppose you can. That means there’s a first time that all the sweets were on one colour. Call this time T. Without loss of generality, all the sweets are on red at T. Where could the sweets have been at time T-1? I claim they must all have been on blue, which contradicts minimality. Why? Because if at least one red child had at least one sweet, they must have passed at least one sweet to a blue neighbour.

Now it remains to give a construction for all other cases. In the end, my proof has two stages:

Step One: Given a configuration, in two steps, you can move a candy two places to the right, leaving everything else unchanged.

This is enough to settle the n odd case. For the even case, we need an extra step, which really corresponds to an initial phase of the construction.

Step Two: We can make some version of the ‘sweeping’ move precise, to end up in some configuration where the red children have, between them, any number of sweets except 0 or n.

Step one is not so hard. Realising that step one would be a useful tool to have was probably the one moment where I shifted from feeling like I hadn’t got into the problem to feeling that I’d mostly finished it. As ever in constructions, working out how to do a small local adjustment, which you plan to do lots of times to get a global effect, is great. (Think of how you solve a Rubik’s cube for example.)

Step two is notationally fiddly, and I would think very carefully before writing it up. In the end I didn’t use the sweeping move. Instead, with the observation that you can take an adjacent pair and continually swap their sweets it’s possible to set up an induction.

Actual morals: Observing the possibility to make a small change in a couple of moves (Step one above) was crucial. My original moral does still hold slightly. Writing lots of things down didn’t make life easier, and in the end the ideas on the bus were pretty much everything I needed.

[4.1] – one session to a group of 15 year olds is enough to teach you that the canon is always ‘marbles in urns’ never ‘balls’ nor ‘bags’, let alone both.

# RMM 2017 – Problems 2, 3 and 6

In the previous post, I discussed Problems 1, 4 and 5 from this year’s Romanian Master of Mathematics competition. In this post, I discuss the harder problems (modulo my subjective appreciation of difficulty).

Problem 2

Determine all positive integers n satisfying the following condition: for every monic polynomial P of degree at most n with integer coefficients, there exists a positive integer $k \leq n$, and (k+1) distinct integers $x_1,\ldots,x_{k+1}$ such that

$P(x_1) + P(x_2) + \cdots + P(x_k) = P(x_{k+1}).$

Parsing this question deserves at least a moment. Straight after a first reading, I find it worth writing down any key quantifiers which I might forget later. Here, it’s the words at most. If you want to show the statement holds for n=2, you need to investigate monic polynomials with degree zero, one and two. You should also make sure that any instances of $x_i$ really are always distinct.

This matters in competitions! Two of our contestants failed to get the mark for showing n=2 works, precisely because of not checking the linear case, and a third could have lost it for using examples which are sometimes not distinct. On hard papers, one mark actually is the difference between triumph and frustration. And of course it matters outside competitions too, since small cases are exactly what your reader might examine first, to check they understand the problem posed, so it’s not a good place for awkward errors.

I started by trying to show that it couldn’t possibly happen that every polynomial with degree at most n had this property, for some combinatorial reason. For example, that if every set of distinct integers could only be a solution set for a small number of polynomials, then we would end up with not enough polynomials. But I couldn’t make this work at all; every bound ended up heavily in the wrong direction.

The next natural question is, does a typical polynomial of degree at most n have this property? But choosing a typical polynomial is hard, so in fact I asked, do the simplest polynomials of degree at most n have this property? I think the simplest polynomials of degree at most n are $\{1,x,x^2,\ldots,x^n\}$. Under what circumstances does

$x_1^m + \ldots + x_k^m = x_{k+1}^m,$ (1)

have solutions in distinct integers? Famously, when k=2 and $m\ge 3$ this is a very very hard problem indeed. So the first point is that though it might be useful to use Fermat’s Last Theorem, it would be foolish to pursue a strategy which, if successful, would have a proof of FLT as a sub-problem. At least, it would be foolish if the aim was to finish this strategy within a few hours.

So my main comment on this question is meta-mathematical. If lots of attempts at general arguments don’t work, there must be some special example that does it. And what properties do I want this special example to have? Maybe one might have thought of this from scratch, but my motivation came from (1) in the case m=p-1. Then, by Fermat’s Little Theorem, all the summands are equal to 1 or 0 modulo p. If k>p, then after discounting any uniform factors of p, we obtain a congruence equation which is, in informal terms,

$\left(0\text{ or }1\right)+\ldots+\left(0\text{ or }1\right) \equiv \left(0\text{ or }1\right).$

This looks really promising because it’s quite restrictive, but it’s still just a bit annoying: there are quite a few solutions. But it does give us the right idea, which is to find a polynomial P for which $P(x)\equiv 1$ modulo n. The equation $1+\ldots+1\equiv 1$ modulo n has solutions only if the number of summands on the LHS is 1 modulo n, and since $k\le n$ this forces k=1. So in this context, this reduces to showing that P is, additionally, injective on the integers, ie that P(x)=P(y) only when x=y.
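The Fermat’s Little Theorem observation behind the informal congruence is easy to see numerically. This is a small illustrative sketch: the function name and the range of test values are arbitrary choices, and a composite modulus is included only for contrast.

```python
def residues_of_power(p, span=3):
    """Set of values of x^(p-1) modulo p, for x in a window around zero."""
    return {pow(x, p - 1, p) for x in range(-span * p, span * p + 1)}

for p in (2, 3, 5, 7, 11, 13):
    print(p, residues_of_power(p))   # for prime p, only 0 and 1 appear

print(9, residues_of_power(9))       # a composite modulus behaves differently
```

So for prime p each summand in (1) with m=p-1 contributes 0 or 1 modulo p, which is what makes the congruence restrictive.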

It’s a nice exercise to show the existence of polynomials which are constant modulo n, and a good problem to work out how to force injectivity. If a polynomial is increasing everywhere, then it is certainly injective, and so the problem ends up being slightly easier in the case where the degree is odd than when the degree is even, but this is a nice conclusion to a nice problem, so I’ll save it for any interested readers to finish themselves.

Problem 3

Let n be an integer greater than 1 and let X be an n-element set. A non-empty collection of subsets $A_1,\ldots, A_k$ of X is tight if the union $A_1 \cup \dots \cup A_k$ is a proper subset of X and no element of X lies in exactly one of the $A_i$s. Find the largest cardinality of a collection of proper non-empty subsets of X, no non-empty subcollection of which is tight.

Note. A subset A of X is proper if $A\neq X$. The sets in a collection are assumed to be distinct. The whole collection is assumed to be a subcollection.

By Neel Nanda:

If |X|=n, there are $2^n$ possible subsets, so at first glance the answer could be a variety of things, from a linear to an exponential function of n, each of which would suggest a different approach. So the first step is to conjecture an answer, and by examining small cases it seems impossible to do better than 2n-2. There are several natural constructions for this bound, such as n subsets of size (n-1) and (n-2) subsets of size 1, so we guess this to be our answer (which later turns out to be right!).

From here, a solution is deceptively simple, though empirically the five full solutions in the contest show that it was by no means easy to find. We proceed by induction on the size of X, and want to show that any collection of subsets S has size at most (2n-2). By assumption all subcollections are not tight, so if the union of a subcollection is not the whole set X, then there is an element which appears in exactly one subset. This is a useful result, so we’d like to force a subcollection whose union is not the whole set X.

One way to guarantee that the union of a subcollection is not X is by taking the subcollection of all subsets not containing some element b. So there is some element c which appears in only one subset not containing b. If we choose b so that it’s the element contained in the fewest subsets of S, c is in at least as many subsets of S, but in only one subset not containing b. This means that at most one subset containing b doesn’t contain c. This is useful, because after removing at most 2 subsets (the coefficient of n in 2n-2, importantly!), we now have that every subset in S either contains both b and c or neither. This means that we can replace the pair (b,c) with a new element d, to get a new collection of subsets S’ of a set X’, of size n-1, so by induction $|S| \le |S'|+2\le 2n-2$.

There is also the case where all subsets contain b, but we can create an equivalent collection of subsets of X \ {b} by removing b from all subsets. So again by induction we are done.
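The small cases behind the guess 2n-2 can be examined exhaustively. The sketch below is a brute-force check for tiny n and is not part of the solution above. It encodes subsets of X as bitmasks and marks a collection as bad if it is tight or contains a bad subcollection. The function name and encoding are illustrative, and the running time is doubly exponential in n, so n=4 is about the practical limit.

```python
def largest_tight_free(n):
    """Largest size of a collection of proper non-empty subsets of an
    n-element set with no tight non-empty subcollection (brute force)."""
    full = (1 << n) - 1
    subsets = list(range(1, full))        # proper non-empty subsets as bitmasks
    k = len(subsets)

    def is_tight(collection):
        once = twice = 0                  # elements covered >= 1 and >= 2 times
        for i in range(k):
            if (collection >> i) & 1:
                twice |= once & subsets[i]
                once |= subsets[i]
        # union is proper, and no element is covered exactly once
        return collection != 0 and once != full and once == twice

    bad = bytearray(1 << k)               # bad[c] = 1 if c has a tight subcollection
    best = 0
    for c in range(1, 1 << k):
        smaller_bad = any(
            bad[c & ~(1 << i)] for i in range(k) if (c >> i) & 1
        )
        if smaller_bad or is_tight(c):
            bad[c] = 1
        else:
            best = max(best, bin(c).count("1"))
    return best

print([largest_tight_free(n) for n in (2, 3, 4)])
```

The values for n=2, 3, 4 agree with 2n-2, which is reassuring evidence for the conjecture but says nothing about general n.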

Problem 6

Let ABCD be any convex quadrilateral and let P, Q, R, S be points on the segments AB, BC, CD, and DA, respectively. It is given that the segments PR and QS dissect ABCD into four quadrilaterals, each of which has perpendicular diagonals. Show that the points P, Q, R, S are concyclic.

I thought this problem was extremely hard. The official solution starts with a ‘magic lemma’, that isn’t quite so magic if you then read how it’s used. The overall claim is that PQ, RS and AC are concurrent (or parallel), and this is proved using the fact that the radical axis of the two circles with diameters PQ and RS also passes through this point of concurrency. Hunting for key properties of subsets of points in the diagram is an important skill in hard olympiad geometry, since it exactly reflects how problem-setters produce the problems. All the more so when there is lots of symmetry in the construction. But this is a hard example – there are a lot of potentially relevant subsets of the configuration.

When you’re really stuck with how to get involved in a synthetic configuration, you might consider using coordinates. Some of the UK students have been reading some chapters of a book (Euclidean Geometry in Mathematical Olympiads by Evan Chen. I’ve only had my own copy for a couple of days, but my initial impression is very positive – it fills a gap in the literature in a style that’s both comprehensive and readable.) focusing on various analytic approaches, so James and I felt it was safer to make sure we knew what the best settings were, and how far we could take them.

You almost certainly want the intersection of PR and QS to be your origin. I wanted to set up the configuration using the language of vectors, referenced by (P,Q,R,S). This was because $PQ\perp BO$ and so on, hence $\mathbf{b}\cdot (\mathbf{q}-\mathbf{p})=0$ and so on. An alternative is to use complex numbers, which makes this condition a bit more awkward, but is more promising for the conclusion. Concyclicity is not a natural property in vectors unless you can characterise the centre of the circle, but can be treated via cross-ratios in $\mathbb{C}$. You also have to decide whether to describe the collinearity of A, B and P by expressing $\mathbf{p}=\lambda_{\mathbf{p}} \mathbf{a}+(1-\lambda_{\mathbf{p}})\mathbf{b}$, or via something more implicit. There definitely are not four degrees of freedom here, since specifying A certainly defines at most one valid set of (B,C,D), so one is mindful we’ll have to eliminate many variables later. We also have to account for the fact that $\mathbf{r}$ is a negative scalar multiple of $\mathbf{p}$, and it’s not clear whether it’s better to break symmetry immediately, or use this towards the end of a calculation.

The point of writing this was that if your initial thought was ‘this looks promising via coordinate methods’, then I guess I agree. But there’s a difference between looking promising, and actually working, and there are lots of parameterisation options. It’s certainly worth thinking very carefully about which to choose, and in this case, challenging though they were, the synthetic or synthetic-trigonometric methods probably were better.

# EGMO 2016 Paper II

Continuing from yesterday’s account of Paper I, this is a discussion of my thoughts about Paper II of EGMO 2016, happening at the moment in Busteni, Romania. This is not an attempt to describe official solutions, but rather to describe the thought process (well, a thought process) of someone tackling each question. I hope it might be interesting or useful, but for students, it will probably be more useful after at least some engagement with the problems. These are excellent problems, and reading any summary of solutions means you miss the chance to hunt for them yourself.

In actual news, you can follow the scoreboard as it is updated from Romania here. Well done to the UK team on an excellent performance, and hope everyone has enjoyed all aspects of the competition!

Question 4

Circles $\omega_1,\omega_2$ with the same radius meet at two points $X_1,X_2$. Circle $\omega$ is externally tangent to $\omega_1$ at $T_1$, and internally tangent to $\omega_2$ at $T_2$. Prove that lines $X_1T_1,X_2T_2$ meet on $\omega$.

Thought 1: I’m not the biggest fan of geometry ever, but I thought this looked like a nice problem, because it’s only really about circles, so I figured it probably wouldn’t require anything too exotic.

Thought 2: I bet lots of people try inversion. But the equal radius condition means I’m probably happy with the circles as they are. I hope lots of people don’t try to place the diagram in some co-ordinate system, even if it is possible to do it sensibly (eg by making $\omega$ the reference circle).

Thought 3: The labelling of $X_1,X_2$ is unrelated to the rest of the indexing. So the intersection of $X_1T_2,X_2T_1$ should also lie on $\omega$, and possibly has some relationship (antipodal?) to the point I actually need to find out. But I can’t think of any reason why it’s easier to prove two points lie on a circle than just one, so let’s leave this as a thought rather than an idea.

Idea 1: I drew a terrible diagram on the back of a draft of my abstract, and for once, this was actually kind of helpful. Forget about radii being equal, one of them wasn’t even a circle. Anyway, while drawing in the later points, I was struggling to make it look convincingly like all the lengths which were supposed to be equal were in fact equal. So the idea was: almost all the segments in the diagram (once I’ve defined the circle centres $O_{\omega_1}$ etc) have one of two lengths (the radii of $\omega_1,\omega$ – red and green-ish in the diagram below), and with this in mind I can forget about the circles. We’ve got a rhombus, which is even better than a parallelogram, which is itself a really useful thing to have in a configuration. Another consequence of the proliferation of equal lengths is that almost all triangles are isosceles, and we know that similarity of isosceles triangles is particularly easy, because you only have to match up one angle.

Idea 2: How to prove it? We have to prove that two lines and a circle concur. This is where I actually need to stop and think for a moment: I could define the point where the lines meet and try to show it’s on the circle, or intersect one line with the circle, and show it’s on the other line. Idea 1 basically says I’m doing the problem using lengths, so I should choose the way that fits best with lengths.

If I define the point P where $X_2T_2$ meets the circle (this was easier to draw in my diagram), then I know $PO_\omega=T_2 O_\omega$ and so on. Then there were loads of isosceles triangles, and some of them were similar, which led to more parallel lines, and from this you could reverse the construction in the other direction to show that P also lay on the other line.

Question 5

Let k, n be integers such that $k\ge 2,\, k\le n\le 2k-1$. Place rectangular k x 1 or 1 x k tiles on an n x n chessboard in the natural way with no overlap until no further tile can be placed. Determine the minimum number of tiles that such an arrangement may contain.

Idea 1: It took me a while to parse the question. Minimum over what? I rephrased it in my head as: “to show the answer is eg n+5, I need to show that whenever you place n+4 tiles legally, you can’t add another. I also need to show that you can place n+5 such that you can’t add another.” This made life a lot easier.

Thought 1: What goes wrong if you take n=2k and beyond? Well, you can have two horizontal tiles on a given row. I’m not really sure how this affects the answer, but the fact that there is still a space constraint for n<2k is something I should definitely use.

Diversion: I spent a while thinking that the answer was 4 and it was a very easy question. I spent a bit more time thinking that the answer was n, and it was a quite easy question, then realised that neither my construction nor my argument worked.

Thought 2: can I do the cases n=k, or 2k-1, or k+1? The answers were yes, unsure, and yes. The answer to k+1, which I now felt confident was actually four, was helpful, as it gave me a construction for k+2, …, 2k-1 that seemed good, even though it was clearly not optimal for 2k-1. Therefore, currently my potential answer has three regimes, which seemed unlikely, but this seemed a good moment to start trying to prove it was optimal. From now on, I’m assuming I have a configuration from which you can’t add another block.

Idea 2: About this diagram, note that once I’ve filled out the top-left (k+1)x(k+1) sub-board in this way, there are still lots of ways to complete it, but I do have to have (n-k-1) horizontal and (n-k-1) vertical tiles roughly where I’ve put them. Why? Because I can’t ‘squeeze in’ a vertical tile underneath the blue bit, and I can’t squeeze in a horizontal tile to the right of the blue bit. Indeed, whenever I have a vertical block, there must be vertical blocks either to the left or to the right (*) (or possibly both if we’re near the middle). We need to make this precise, but before doing that, I looked back at where the vertical blocks were in the proposed optimum, and it turns out that all but (k-1) columns include a vertical block, and these (k-1) columns are next to each other.

This feels like a great idea for two reasons: 1) we’ve used the fact that n<2k at (*). 2) this feels very pigeonhole principle-ish. If we had fewer tiles, we’d probably have either at least k columns or at least k rows without a vertical (or, respectively, horizontal) tile. Say k columns don’t include a vertical tile – so long as they are next to each other (which I think I know) we can probably include a horizontal tile somewhere in there.

So what’s left to do? Check the previous sentence actually works (maybe it’s full of horizontal tiles already?), and check the numerics of the pigeonhole bound. Also work out how the case n=2k-1 fits, but it seems like I’ve had some (/most) of the good ideas, so I stopped here.

Question 6

I don’t actually want to say very much about this, because I didn’t finish off all the details. I want to talk briefly in quite vague terms about what to do if you think this problem looks scary. I thought it looked a bit scary because it looked similar to two number theoretic things I remember: 1) primes in arithmetic progressions. This is very technical in general, but I can remember how to do 3 mod 4 fairly easily, and 1 mod 4 with one extra idea; 2) if a square-free number can be written as a sum of two squares, this controls its factors modulo 4.

Vague Ideas: It seemed unlikely that this would involve copying a technical argument, so I thought about why I shouldn’t be scared. I think I shouldn’t be scared of the non-existence part. Often when I want to show there are no integer solutions to an equation, I consider showing there are no solutions modulo some base, and maybe this will be exactly what I do here. I’ll need to convert this statement about divisibility into an equation (hopefully) and check that $n\equiv 3,4$ modulo 7 doesn’t work.

For the existence of infinitely many solutions, maybe I’d use Chinese Remainder Theorem [1], or I’ll reduce it to something that I know has lots of solutions (eg Pythagoras), or maybe I can describe some explicit solutions?

Actual Idea 1: We’ve got $n^2+m | n^4$, but this is a very inefficient statement, since the RHS is a lot larger than the LHS, so to be useful I should subtract off a large multiple of the LHS. Difference of two squares is a good thing to try always, or I could do it manually. Either way, I get $n^2+m | m^2$ which is genuinely useful, given I know m=1,2, …, 2n, because the RHS is now comparable in size to the LHS, so I’ve narrowed it down from roughly $n^2$ possibilities to just three:

$n^2+m=m^2,\quad 2(n^2+m)=m^2,\quad 3(n^2+m)=m^2.$ (*)
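The two steps that lead to (*) can be written out explicitly. The constant c below is a label introduced here for the ratio $m^2/(n^2+m)$; it does not appear in the discussion above.

```latex
\begin{align*}
n^4 - m^2 &= (n^2-m)(n^2+m)
  &&\Longrightarrow\quad n^2+m \,\big|\, n^4 \iff n^2+m \,\big|\, m^2,\\
1\le m\le 2n &\;\Longrightarrow\; 0 < m^2 \le 4n^2 < 4(n^2+m)
  &&\Longrightarrow\quad m^2 = c\,(n^2+m)\ \text{with } c\in\{1,2,3\}.
\end{align*}
```

So the size bound on m is exactly what limits the multiple to one of three values.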

I’m going to stop now, because we’ve turned it into a completely different problem, which might be hard, but at least in principle this is solvable. I hope we aren’t actually scared of (*), since it looks like some problems we have solved before. I could handle one of these in a couple of lines, then struggled a bit more with the other pair. I dealt with one by recourse to some theory, and the final one by recourse to some theory after a lot of rearranging which I almost certainly got wrong, but I think I made an even number of mistakes rather than an odd number because I got the correct solution set modulo 7. Anyway, getting to (*) felt like the majority of the ideas, and certainly removed the fear factor of the Q6 label, so to fit the purpose of this discussion I’ll stop here.
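The modulo 7 pattern can also be observed directly from the divisibility condition, without solving (*). The sketch below simply searches small n and is illustrative only: the function names and the search limit are arbitrary, and a finite search is evidence rather than proof.

```python
def has_divisor_in_range(n):
    """True if some n^2 + m with 1 <= m <= 2n divides n^4."""
    return any(n ** 4 % (n * n + m) == 0 for m in range(1, 2 * n + 1))

def residues_mod_7(limit=400):
    """Return the qualifying n up to limit and their residues modulo 7."""
    good = [n for n in range(1, limit + 1) if has_divisor_in_range(n)]
    return good, sorted({n % 7 for n in good})

good, residues = residues_mod_7()
print(good)
print(residues)
```

For example n=2 qualifies via m=4, since 8 divides 16, and no qualifying n in the search range is congruent to 3 or 4 modulo 7.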

[1] During one lunch in Lancaster, we were discussing why Chinese Remainder Theorem is called this. The claim was that an ancient Chinese general wanted to know the size of their army but it was too big to count, so had them arrange themselves into columns of various sizes, and counted the remainders. The general’s views on the efficiency of this algorithm remain lost in the mists of time.