BMO1 2017 – Questions 5 and 6

The first round of the British Mathematical Olympiad was sat yesterday. The questions can be found here and video solutions here. My comments on the first four questions are in the previous post.

Overall, I didn’t think any of the questions on this paper were unusually difficult by the standard of BMO1, but I found everything slightly more time-consuming than typical. I thought Question 5 was a great problem, and I tried lots of things unsuccessfully, first, and so wanted to discuss it in slightly more technical language. For Question 6 I made a decisive mistake, which I’ll explain, and which cost a lot of time. But in general, my point is that the back end of the paper was a little fiddlier than normal, and required longer written solutions, and perhaps many students might have had less time than expected to attack them anyway after details earlier in the paper.

Question Five

As I said before, I thought this question was quite challenging. Not because the solution is particularly exotic or complicated, but because there were so many possible things that might have worked. In my opinion it would not have been out of place at the start of an IMO paper, because it’s perfectly possible to have enough good ideas that eliminating the ones that don’t work takes an hour, or hours. Even though it slightly spoils the flow of the solution, I’m particularly trying to emphasise the tangents that didn’t work, mostly for reassurance to anyone who spent a long time struggling.

I was thinking about this question in terms of a 2Nx2N board, where N is even, and for the given question equal to 100. I spent a while thinking that the bound was 8N-4, corresponding to taking the middle two rows and the middle two columns, but not the 2×2 square which is their intersection. If you think of a comb as a ‘handle’ of 1xN cells, with an extra N/2 alternating cells (say, ‘teeth’) bolted on, then it’s clear this construction works because there’s never space to fit in a handle, let alone the teeth.

I couldn’t prove that this was optimal though. A standard way to prove a given bound K was optimal would be to produce a tiling on the board with K combs, where every cell is included in exactly one comb. But this is clearly not possible in this situation, since the number of cells in a comb (which is 150) does not divide the total number of cells on the board.

Indeed, the general observation that if you take a comb and a copy of the comb rotated by 180, the teeth of the second comb can mesh perfectly with the teeth of the first comb to generate a 3xN unit. I wasted a moderate amount of time pursuing this route.

[Note, it will be obvious in a minute why I’m writing ‘shaded’ instead of ‘coloured’.]

But in motivating the construction, I was merely trying to shade cells so that they intersected every possible 1xN handle, and maybe I could prove that it was optimal for this. In fact, I can’t prove it’s optimal because it isn’t optimal – indeed it’s clear that a handle through one of the middle rows intersects plenty of shaded cells, not just one. However, with this smaller problem in mind, it didn’t take long to come up with an alternative proposal, namely splitting the board into equal quarters, and shading the diagonals of each quarter, as shown.

It seems clear that you can’t fit in a 1xN handle, and any sensible tiling with 1xN handles contains exactly one shaded cell, so this shading (with 4N shaded cells) is optimal. But is it optimal for a comb itself?

Consider a shading which works, so that all combs include a shaded cell. It’s clear that a comb is contained within a 2xN block, and in such a 2xN block, there are four possible combs, as shown.

You need to cover all these combs with some shading somewhere. But if you put the shaded cell on a tooth of comb A, then you haven’t covered comb B. And if you put the shaded cell on the handle of comb A, then you haven’t covered one of comb C and comb D. You can phrase this via a colouring argument too. If you use four colours with period 2×2, as shown

then any comb involves exactly three colours, and so one of them misses out the colour of the shaded cell. (I hope it’s clear what I mean, even with the confusing distinction between ‘shaded’ and ‘coloured’ cells.)

Certainly we have shown that any 2xN block must include at least two shaded cells. And that’s pretty much it. We have a tiling with 2N copies of a 2xN block, and with at least two shaded cells in each, that adds to at least 4N shaded cells overall.

Looking back on the method, we can identify another way to waste time. Tiling a board, eg a chessboard with dominos is a classic motif, which often relies on clever colouring. So it’s perhaps lucky that I didn’t spot this colouring observation earlier. Because the argument described really does use the local properties of how the combs denoted A-D overlap. An attempt at a global argument might start as follows: we can identify 2N combs which don’t use colour 1, and tile this subset of the grid with them, so we need to shade at least 2N cells from colours {2,3,4}. Similarly for sets of colours {1,3,4}, {1,2,4}, and {1,2,3}. But if we reduce the problem to this, then using roughly 2N/3 of each colour fits this global requirement, leading to a bound of 8N/3, which isn’t strong enough. [1]

Question Six

A word of warning. Sometimes it’s useful to generalise in problems. In Q5, I was thinking in terms of N, and the only property of N I used was that it’s even. In Q4, we ignored 2017 and came back to it at the end, using only the fact that it’s odd. By contrast, in Q2, the values did turn out to be important for matching the proof bounds with a construction.

You have to guess whether 300 is important or not here. Let’s see.

I have a natural first question to ask myself about the setup, but some notation is useful. Let a_1,a_2,\ldots,a_{300} be the ordering of the cards. We require that \frac{a_1+\ldots+a_n}{n} is an integer for every 1\le n\le 300. Maybe the values of these integers will be important, so hold that thought, but for now, replace with the divisibility statement that n | a_1+\ldots+a_n.

I don’t think it’s worth playing with small examples until I have a better idea whether the answer is 5 or 295. So the natural first question is: “what does it mean to have (a_1,\ldots,a_{n-1}) such that you can’t pick a suitable a_n?”

It means that there is no integer k in \{1,\ldots,300\}\backslash\{a_1,\ldots,a_{n-1}\} such that n\,\big|\,(a_1+\ldots+a_{n-1})+k, which for now we write as

k\equiv -(a_1+\ldots+a_{n-1})\,\mod n.

Consider the congruence class of -(a_1+\ldots+a_{n-1}) modulo n. There are either \lfloor \frac{300}{n}\rfloor or \lceil \frac{300}{n}\rceil integers under consideration in this congruence class. If no such k exists, then all of the relevant integers in this congruence class must appear amongst \{a_1,\ldots,a_{n-1}\}. At this stage, we’re trying to get a feel for when this could happen, so lower bounds on n are most relevant. Therefore, if we get stuck when trying to find a_n, we have

\lfloor \frac{300}{n} \rfloor\text{ or }\lceil \frac{300}{n}\rceil \le n-1, (*)

which is summarised more succinctly as

\lfloor \frac{300}{n} \rfloor \le n-1. (**)

[Note, with this sort of bounding argument, I find it helpful to add intermediate steps like (*) in rough. The chance of getting the wrong direction, or the wrong choice of \pm 1 is quite high here. Of course, you don’t need to include the middle step in a final write-up.]

We can check that (**) is false when n\le 17 and true when n\ge 18. Indeed, both versions of (*) are true when n\ge 18.

So we know the minimum failure length is at least 17. But is there a failing sequence of length 17? At a meta-level, it feels like there should be. That was a very natural bounding argument for 17 (which recall corresponds to n=18), and it’s easy to believe that might be part of an official solution. If we achieve equality throughout the argument, that’s most of the way to a construction as well. It won’t be so easy to turn this argument into a construction for n\ge 19 because there won’t be equality anywhere.

We have to hope there is a construction for n=18. What follows is a description of a process to derive (or fail to derive) such a construction. In a solution, one would not need to give this backstory.

Anyway, in such a construction, let \alpha\in\{1,2,\ldots,18\} describe the congruence class modulo 18 which is exhausted by \{a_1,\ldots,a_{17}\}. I’m going to hope that \alpha=18 because then the calculations will be easier since everything’s a multiple of 18. We haven’t yet used the fact that for a problem, we need \alpha\equiv-(a_1+\ldots+a_{n-1}). We definitely have to use that. There are 16 multiples of 18 (ie relevant integers in the congruence class), so exactly one of the terms so far, say a_j, is not a multiple of 18. But then

0 \equiv 0+\ldots+0+a_j+0+\ldots+0,

which can’t happen. With a bit of experimentation, we find a similar problem making a construction using the other congruence classes with 16 elements, namely \alpha\in \{13,14,\ldots,18\}.

So we have to tackle a different class. If \alpha\le 12 then our sequence must be

\alpha,18+\alpha,2\times 18 +\alpha, \ldots, 16\times 18 + \alpha,

in some order. In fact, let’s add extra notation, so our sequence is

(a_1,\ldots,a_{17}) = (18\lambda_1+ \alpha,\ldots,18\lambda_{17}+\alpha),

where (\lambda_1,\ldots,\lambda_{17}) is a permutation of {0,…,16}. And so we require

k \,\big|\, 18(\lambda_1+\ldots+\lambda_k) + k\alpha, (%)

for 1\le k\le 17. But clearly we can lop off that k\alpha, and could ignore the 18. Can we find a permutation \lambda such that

k \,\big|\, \lambda_1+\ldots+\lambda_k.

This was where I wasted a long time. I played around with lots of examples and kept getting stuck. Building it up one term at a time, I would typically get stuck around k=9,10. And I had some observations that in all the attempted constructions, the values of \frac{\lambda_1+\ldots+\lambda_k}{k} were around 8 and 9 too when I got stuck.

I became convinced this subproblem wasn’t possible, and decided that would be enough to show that n=18 wasn’t a possible failure length. I was trying to show the subproblem via a parity argument (how must the a_is alternate odd/even to ensure all the even partial sums are even) but this wasn’t a problem. Then I came up with a valid argument. We must have

\lambda_1+\ldots+\lambda_{17}=136= 16\times 8 + 8\quad\text{and}\quad 16\,\big|\,\lambda_1+\ldots+\lambda_{16},

which means \lambda_1+\ldots+\lambda_{16} must be 128 = 15×8 + 8, ie \lambda_{17}=8. But then we also have 15\,\big|\, \lambda_1+\ldots+\lambda_{15}, which forces $latex\lambda_{16}=8$ also. Which isn’t possible.

If this then hadn’t wasted enough time, I then tried to come up with a construction for n=19, for which there are lots more variables, and took a lot more time, and seemed to be suffering from similar problems, just in a more complicated way. So I became convinced I must have made a mistake, because I was forced down routes that were way too complicated for a 3.5 hour exam. Then I found it…

What did I do wrong? I’ll just say directly. I threw away the 18 after (%). This made the statement stronger. (And in fact false.) Suppose instead I’d thrown away a factor of 9 (or no factors at all, but it’s the residual 2 that’s important). Then I would be trying to solve

k\,\big|\,2(\lambda_1+\ldots+\lambda_k).

And now if you experiment, you will notice that taking \lambda_1=0,\lambda_2=1,\lambda_3=2,\ldots seems to work fine. And of course, we can confirm this, using the triangle number formula for the second time in the paper!

This had wasted a lot of time, but once that thought is present, we’re done, because we can go straight back and exhibit the sequence

(a_1,\ldots,a_{17}) = (1, 18+1,2\times 18 +1,\ldots, 16\times 18 +1).

Then the sum so far is congruent to -1 modulo 18, but we have exhausted all the available integers which would allow the sum of the first 18 terms to be a multiple of 18. This confirms that the answer to the question as stated is 17.

At the start, I said that we should be cautious about generalising. In the end, this was wise advice. We definitely used the fact that 18 was even in the stage I over-reduced the first time. We also used the fact that there was at least one value of \alpha with an ‘extra’ member of the congruence class. So I’m pretty sure this proof wouldn’t have worked with 288 = 16×18 cards.

Footnotes

[1] – If shading were a weighted (or continuous or whatever you prefer) property, ie that each cell has a quantity of shading given by a non-negative real number, and we merely demand that the total shading per comb is at least one, then the bound 8N/3 is in fact correct for the total shading. We could look at a 2xN block, and give 1/3 shading to one cell of each colour in the block. Alternatively, we could be very straightforward and apply 2/3N shading to every cell in the grid. The fact that shading has to be (in this language) zero or one, imposes meaningful extra constraints which involve the shape of the comb.

Advertisements

BMO1 2017 – Questions 1-4

The first round of the British Mathematical Olympiad was sat yesterday. The questions can be found here. I recorded some thoughts on the questions while I was in Cyprus, hence the nice Mediterranean sunset above. I hope this might be useful to current or future contestants, as a supplement to the concise official solutions available. It goes without saying that while these commentaries may be interesting at a general level, they will be much more educational to students who have at least digested and played around with the questions, so consider trying the paper first. Video solutions are available here. These have more in common with this blog post than the official solutions, though inevitably some of the methods are slightly different, and the written word has some merits and demerits over the spoken word for clarity and brevity.

The copyright for these questions lies with BMOS, and are reproduced here with permission. Any errors or omissions are obviously my own.

I found the paper overall quite a bit harder than in recent years, or at least harder to finish quickly. I’ve therefore postponed discussion of the final two problems to a second post, to follow shortly.

Question One

A recurring theme of Q1 from BMO1 in recent years has been: “it’s possible to do this problem by a long, and extremely careful direct calculation, but additional insight into the setup makes life substantially easier.”

This is the best example yet. It really is possible to evaluate Helen’s sum and Phil’s sum, and compare them directly. But it’s easy to make a mistake in recording all the remainders when the divisor is small, and it’s easy to make a mistake in summation when the divisor is large, and so it really is better to have a think for alternative approaches. Making a mistake in a very calculation-heavy approach is generally penalised heavily. And this makes sense intellectually, since the only way for someone to fix an erroneous calculation is to repeat it themselves, whereas small conceptual or calculation errors in a less onerous solution are more easily isolated and fixed by a reader. Of course, it also makes sense to discourage such attempts, which aren’t really related to enriching mathematics, which is the whole point of the exercise!

Considering small divisors (or even smaller versions of 365 and 366) is sometimes helpful, but here I think a ‘typical’ divisor is more useful. But first, some notation will make any informal observation much easier to turn into a formal statement. Corresponding to Helen and Phil, let h(n) be the remainder when n is divided by 365, and p(n) the remainder when n is divided by 366. I would urge students to avoid the use of ‘mod’ in this question, partly because working modulo many different bases is annoying notationally, partly because the sum is not taken modulo anything, and partly because the temptation to use mod incorrectly as an operator is huge here [1].

Anyway, a typical value might be n=68, and we observe that 68 x 5 + 25 = 365, and so h(68)=25 and p(68)=26. Indeed, for most values of n, we will have p(n)=h(n)+1. This is useful because

p(1)+p(2)+\ldots+p(366) - \left(h(1)+h(2)+\ldots+h(365)\right)

= \left(p(1)-h(1)\right) + \ldots+\left(p(365)-h(365)\right) + p(366),

and now we know that most of the bracketed terms are equal to one. We just need to handle the rest. The only time it doesn’t hold that p(n)=h(n)+1 is when 366 is actually a multiple of n. In this case, p(n)=0 and h(n)=n-1. We know that 366 = 2 x 3 x 61, and so its divisors are 1, 2, 3, 6, 61, 122, 183.

Then, in the big expression above, seven of the 365 bracketed terms are not equal to 1. So 358 of them are equal to one. The remaining ones are equal to 0, -1, -2, -5, -60, -121, -182 respectively. There are shortcuts to calculate the sum of these, but it’s probably safer to do it by hand, obtaining -371. Overall, since p(366)=0, we have

p(1)+p(2)+\ldots+p(366) - \left(h(1)+h(2)+\ldots+h(365)\right)

= -371 + 358 + 0 = -13.

So, possibly counter-intuitively, Helen has the larger sum, with difference 13, and we didn’t have to do a giant calculation…

Question Two

Suppose each person chooses which days to go swimming ‘at random’, without worrying about how to define this. Is this likely to generate a maximum or minimum value of n? I hope it’s intuitively clear that this probably won’t generate an extreme value. By picking at random we are throwing away lots of opportunity to force valuable overlaps or non-overlaps. In other words, we should start thinking about ways to set up the swimming itinerary with lots of symmetry and structure, and probably we’ll eventually get a maximum or a minimum. At a more general level, with a problem like this, one can start playing around with proof methods immediately, or one can start by constructing lots of symmetric and extreme-looking examples, and see what happens. I favour the latter approach, at least initially. You have to trust that at least one of the extreme examples will be guess-able.

The most obvious extreme example is that everyone swims on the first 75 days, and no-one swims on the final 25 days. This leads to n=75. But we’re clearly ‘wasting’ opportunities in both directions, because there are never exactly five people swimming. I tried a few more things, and found myself simultaneously attacking maximum and minimum, which is clearly bad, so focused on minimum. Just as a starting point, let’s aim for something small, say n=4. The obstacle is that if you demand at most four swimmers on 96 days, then even with six swimmers on the remaining four days, you don’t end up with enough swimming having taken place!

Maybe you move straight from this observation to a proof, or maybe you move straight to a construction. Either way, I think it’s worth saying that the proof and the construction come together. My construction is that everyone swims on the first 25 days, then on days 26-50 everyone except A and B swim, on days 51-75 everyone except C and D swim, and on days 76-100 everyone except E and F swim. This exactly adds up. And if you went for the proof first, you might have argued that the total number of swim days is 6×75 = 450, but is at most 4n + 6(100-n). This leads immediately to n\ge 25, and I just gave the construction. Note that if you came from this proof first, you can find the construction because your proof shows that to be exact you need 25 days with six swimmers, and 75 days with four swimmers, and it’s natural to try to make this split evenly. Anyway, this clears up the minimum.

[Less experienced contestants might wonder why I was worried about generating a construction despite having a proof. Remember we are trying to find the minimum. I could equally have a proof for n\ge 10 which would be totally totally valid. But this wouldn’t show that the minimum was n=10, because that isn’t in fact possible (as we’ve seen), hence it’s the construction that confirms that n=25 is the true minimum.]

It’s tempting to go back to the drawing board for the maximum, but it’s always worth checking whether you can directly adjust the proof you’ve already given. And here you can! We argued that

450\le 4n + 6(100-n)

to prove the minimum. But equally, we know that on the n days we have at least five swimmers, and on the remaining days, we have between zero and four swimmers, so

450 \ge 5n + 0\times (100-n), (*)

which gives n\le 90. If we have a construction that attains this bound then we are done. Why have I phrased (*) with the slightly childish multiple of zero? Because it’s a reminder that for a construction to attain this bound, we really do need the 90 days to have exactly five swimmers, and the remaining ten days to have no swimmers. So it’s clear what to do. Split the first 90 days into five groups of 15 days. One swimmer skips each group. No-one swims in the final ten days, perhaps because of a jellyfish infestation. So we’re done, and 25\le n\le 90.

At a general level, it’s worth noting that in the story presented, we found an example for the minimum which we turned into a proof, and then a proof for the maximum, which we then analysed to produce a construction.

Note that similar bounding arguments would apply if we fiddled with the numbers 5, 75 and 100. But constructions matching the bounds might not then be possible because the splits wouldn’t work so nicely. This would make everything more complicated, but probably not more interesting.

Question Three

It’s understandable that lots of students attempting this paper might feel ill-at-ease with conventional Euclidean geometry problems. A good first rule of thumb here, as in many settings, is “don’t panic!”, and a more specific second rule of thumb is “even if you think you can calculate, try to find geometric insight first.”

Here, it really does look like you can calculate. A configuration based on a given isosceles triangle and a length condition and a perpendicular line is open to several coordinate approaches, and certainly some sensible trigonometry. It’s also very open to organised labelling of the diagram. You have three equal lengths, and a right-angle, as shown.

The key step is this. Drop the perpendicular from A to BC, and call its foot D. That alone really is the key step, as it reduces both parts of the question to an easy comparison. It’s clear that the line AD splits the triangle into two congruent parts, and thus equal areas and perimeters. So it is enough to show that triangle BMN has the same area as triangle ABD, and that their outer-perimeters (ie the part of its perimeter which is also the perimeter of ABC) are the same.

But they’re congruent, so both of these statements are true, and the problem is solved.

My solution could be as short as two or three lines, so for the purposes of this post all that remains is to justify why you might think of the key step. Here are a few possible entry routes:

  • You might notice that line AD induces the required property for triangle ABD.
  • You might try to find a triangle congruent to AMN, and come up with D that way.
  • There’s already a perpendicular in the question so experimenting with another one is natural, especially since the perpendicular from A has straightforward properties.
  • AMN is a right angle, and so constructing D gives a cyclic quadrilateral. We didn’t use that directly in the proof above, but constructing cyclic quadrilaterals is usually a good idea.
  • If you were trying a calculation approach, you probably introduced the length AD, or at least the midpoint D as an intermediate step.

On the video, Mary Teresa proposes a number of elegant synthetic solutions with a few more steps. You might find it a useful exercise to try to come up with some motivating reasons like the bullet points above to justify her suggestion to reflect A in M as a first step.

Question Four

BMO1 2017Q4

I wasn’t paying enough attention initially, and I calculated a_2=0\text{ or }2. This made life much much more complicated. As with IMO 2017 Q1, if trying to deduce general behaviour from small examples, it’s essential to calculate the small examples correctly!

Once you engage your brain properly, you find that a_2=0 \text{ or }3, and of course a_2=0 is not allowed, since it must be positive. So a_2=3, and a similar calculation suggests a_3=1\text{ or }6. It’s clear that the set of values for a_{k+1} depends only on a_k, so if you take a_3=1, then you’re back to the situation you started with at the beginning. If you choose to continue the exploration with a_3=6, you will find a_4=2\text{ or }10, at which point you must be triggered by the possibility that triangle numbers play a role here.

As so often with a play-around with small values, you need to turn a useful observation into a concrete statement, which could then be applied to the problem statement. It looks like in any legal sequence, every term will be a triangle number, so we only need to clarify which triangle number. An example of a suitable statement might be:

Claim: If a_n=T_k=\frac{k(k+1)}{2}, the k-th triangle number, then a_{n+1}=T_{k-1}\text{ or }T_{k+1}.

There are three stages. 1) Checking the claim is true; 2) checking the claim is maximally relevant; 3) proving it. In this case, proving it is the easiest bit. It’s a quick exercise, and I’m omitting it. Of course, we can’t prove any statement which isn’t true, and here we need to make some quick adjustment to account for the case k=1, for which we are forced to take a_{n+1}=T_{k+1}.

The second stage really concerns the question “but what if a_n\ne T_k?” While there are deductions one could make, the key is that if a_1 is a triangle number, the claim we’ve just made shows that a_n is always a triangle number, so this question is irrelevant. Indeed the claim further shows that a_{2017}\le T_{2017}, and also that a_{2017}=T_k for some odd value of k. To be fully rigorous you should probably describe a sequence which attains each odd value of k, but this is really an exercise in notation [2], and it’s very obvious they are all attainable.

In any case, the set of possible values is \{T_1,T_3,\ldots,T_{2017}\}, which has size 1009.

Final two questions

These are discussed in a subsequent post.

Footnotes

[1] – mod n is not an operator, meaning you shouldn’t think of it as ‘sending integers to other integers’, or ‘taking any integer, to an integer in {0,1,…,n-1}’. Statements like 19 mod 5 = 4 are useful at the very start of an introduction to modular arithmetic, but why choose 4? Sometimes it’s more useful to consider -1 instead, and we want statements like a^p\equiv a modulo p to make sense even when a\ge p. 19 = 4 modulo 5 doesn’t place any greater emphasis on the 4 than the 19. This makes it more like a conventional equals sign, which is of course appropriate.

[2] – Taking a_n=T_n for $1\le n\le k$, and thereafter a_n=T_k if k is odd, and $a_n=T_{k+1}$ if k is even will certainly work, as will many other examples, some perhaps easier to describe than this one, though make sure you don’t accidentally try to use T_0!

Balkan MO 2017 – Qs 1, 3 and 4

The UK is normally invited to participate as a guest team at the Balkan Mathematical Olympiad, an annual competition between eleven countries from South-Eastern Europe. I got to take part in Rhodes almost exactly ten years ago, and this year the competition was held in Ohrid, in Macedonia. There’s one paper, comprising four questions, normally one from each of the agreed olympiad topic areas, with 4.5 hours for students to address them. The contest was sat this morning, and I’m going to say quite a bit about the geometric Q2, and a little bit about Qs 1 and 3 also. In all cases, this discussion will include most of a solution, with some commentary, so don’t read these if you are planning to try the problems yourself.

I’m not saying anything about Q4, because I haven’t solved it. (Edit: I have solved it now, so will postpone Q2 until later today.)

Question One

Find all ordered pairs of positive integers (x,y) such that

x^3+y^3=x^2+42xy+y^2.

The first thought is that if either of x or y is ‘large’, then the LHS is bigger than the RHS, and so equality can’t hold. That is, there are only finitely many solutions. The smallest possible value of y is, naturally, 1, and substituting y=1 is convenient as then y^2=y^3, and it’s straightforward to derive x=7 as a solution.

Regarding the non-existence of large solutions, you can make this precise by factorising the LHS as

(x+y)(x^2-xy+y^2) = x^2+42xy+y^2.

There are 44 terms of degree two on the RHS, and one term of degree in the second bracket on the LHS. With a bit of AM-GM, you can see then that if x+y>44, you get a contradiction, as the LHS will be greater than the RHS. But that’s still a lot of possibilities to check.

It struck me that I could find ways to reduce the burden by reducing modulo various primes. 2, 3 and 7 all divide 42, and furthermore cubes are nice modulo 7 and squares are nice modulo 3, so maybe that would bring the number of possibilities down. But my instinct was that this wasn’t the right way to use the fact that we were solving over positive integers.

The second bracket in the factorisation looks enough like the RHS, that it’s worth exploring. If we move x^2-xy+y^2 from the right to the left, we get

(x+y-1)(x^2-xy+y^2) = 43xy. (1.1)

Now it suddenly does look useful that we are solving over positive integers, because 43 is a prime, so has to appear as a factor somewhere on the LHS. But it’s generally quite restrictive that x^2-xy+y^2 | 43xy. This definitely looks like something that won’t hold often. If x and y are coprime, then certainly x^2-xy+y^2 and y are coprime also. But actually if x and y have a non-trivial common factor d, we can divide both sides by d^2, and it still holds. Let’s write

x=dm,\quad y=dn,\quad\text{where }d=\mathrm{gcd}(x,y).

Then m^2 -mn+n^2 really does divide 43, since it is coprime to both m and n. This is now very restrictive indeed, since it requires that m^2-mn+n^2 be equal to 1 or 43. A square-sandwiching argument gives m^2-mn+n^2=1 iff m=n=1. 43 requires a little bit more work, with (at least as I did it) a few cases to check by hand, but again only has one solution, namely m=7, n=1 and vice versa.

We now need to add the common divisor d back into the mix. In the first case, (1.1) reduces to (2d-1)=43, which gives (x,y)=(22,22). In the second case, after cancelling a couple of factors, (1.1) reduces to (8d-1)=7, from which (x,y)=(7,1),(1,7) emerges, and these must be all the solutions.

The moral here seemed to be that divisibility was a stronger tool than case-reduction. But that was just this question. There are other examples where case-reduction is probably more useful than chasing divisibility.

Question Three

Find all functions f:\mathbb{N}\rightarrow\mathbb{N} such that

n+f(m) \,\big|\, f(n)+nf(m)

for all m,n\in\mathbb{N}.

What would be useful here? There are two variables, and a function. It would be useful if we could reduce the number of variables, or the number of occurences of f. We can reduce the number of variables by taking m=n, to get

n+f(n) \,\big|\, f(n) [1+n]. (3.1)

From this, we might observe that f(n)\equiv 1 is a solution. Of course we could analyse this much more, but this doesn’t look like a 10/10 insight, so I tried other things first.

In general, the statement that a\,|\,b also tells us that a\,|\, b-ka. That is, we can subtract arbitrary multiples of the divisor, and the result is still true. A recurring trope is that the original b is elegant, but an adjusted b-ka is useful. I don’t think we can do the latter, but by subtracting n^2 +nf(m) from the problem statement, we get

n+f(m) \,\big|\, n^2-f(n). (3.2)

There’s now no m on the RHS, but this relation has to hold for all m. One option is that f(n)=n^2 everywhere, then what we’ve deduced always holds since the RHS is zero. But if there’s a value of n for which f(n)\ne n^2, then (3.2) is a very useful statement. From now on, we assume this. Because then as we fix n and vary m, we need n+f(m) to remain a divisor of the RHS, which is fixed, and so has finitely many divisors. So f(m) takes only finitely many values, and in particular is bounded.

This ties to the observation that f\equiv 1 is a solution, which we made around (3.1), so let’s revisit that: (Note, there might be more elegant ways to finish from here, but this is what I did. Also note, n is no longer fixed as in previous paragraph.)

n+f(n) \,\big|\, f(n) [1+n]. (3.1)

Just to avoid confusion between the function itself, and one of the finite collection of values it might take, let’s say b is a value taken by f. So there are values of n for which

n+b \,\big|\, b(1+n).

By thinking about linear equations, you might be able to convince yourself that there are only finitely many solutions (in n) to this relation. There are certainly only finitely many solutions where LHS=RHS (well, at most one solution), and only finitely many where 2xLHS=RHS etc etc. But why do something complicated, when we can actually repeat the trick from the beginning, and subtract b(n+b), to obtain

n+b \,\big|\, b^2-b.

For similar reasons to before, this is a great deduction, because it means if b\ne 1, then the RHS is positive, which means only finitely many n can satisfy this relation. Remember we’re trying to show that no n can satisfy this relation if b\ne 1, so this is definitely massive progress!

If any of what’s already happened looked like magic, I hope we can buy into the idea that subtracting multiples of the divisor from the RHS is the only tool we used, and that making the RHS fixed gives a lot of information about the LHS as the free variable varies. The final step is not magic either. We know that f is eventually 1. If you prefer “for large enough n, f(n)=1,” since all other values appear only finitely often. I could write this with quantifiers, but I don’t want to, because that makes it seem more complicated than it is. We genuinely don’t care when the last non-1 value appears.

Anyway, since we’ve deduced this, we absolutely have to substitute this into something we already have. Why not the original problem statement? Fix m, then for all large enough n

n+f(m) \,\big|\, 1+nf(m). (3.3)

To emphasise, (3.3) has to hold for all large enough n. Is it possible that f(m)=2? Again, it’s easy to convince yourself not. But, yet again, why not use the approach we’ve used so profitably before to clear the RHS? In fact, we already did this, and called it (3.2), and we can make that work [3.4], but in this setting, because f(m) is fixed and we’re working with variable large n, it’s better to eliminate n, to get

n+f(m)\,\big|\, f(m)^2-1,

again for all large enough n. By the same size argument as before, this is totally impossible unless f(m)=1. Which means that in fact f(m)=1 for all m. Remember ages ago we assumed that f(n) was not n^2 everywhere, so this gives our two solutions: f(n)=1,\, f(n)=n^2.

Moral: choosing carefully which expression to work with can make life much more interesting later. Eliminating as many variables or difficult things from one side is a good choice. Playing with small values can help you understand the problem, but here you need to think about soft properties of the expression, in particular what happens when you take one variable large while holding another fixed.

[3.4] – if you do use the original approach, you get n^2-1 on the RHS. There’s then the temptation to kill the divisibility by taking n to be the integer in the middle of a large twin prime pair. Unfortunately, the existence of such an n is still just a conjecture

Question Four

(Statement copied from Art of Problem Solving. I’m unsure whether this is the exact wording given to the students in the contest.)

On a circular table sit n>2 students. First, each student has just one candy. At each step, each student chooses one of the following actions:

(A) Gives a candy to the student sitting on his left or to the student sitting on his right.

(B) Separates all its candies in two, possibly empty, sets and gives one set to the student sitting on his left and the other to the student sitting on his right.

At each step, students perform the actions they have chosen at the same time. A distribution of candy is called legitimate if it can occur after a finite number of steps.
Find the number of legitimate distributions.

My moral for this question is this: I’m glad I thought about this on the bus first. What I found hardest here was getting the right answer. My initial thoughts:

  • Do I know how to calculate the total number of possibilities, irrespective of the algorithm? Fortunately yes I do. Marbles-in-urns = barriers between marbles on a line (maybe add one extra marble per urn first). [4.1]
  • What happens if you just use technique a)? Well first you can get into trouble because what happens if you have zero sweets? But fine, let’s temporarily say you can have a negative number of sweets. If n is even, then there’s a clear parity situation developing, as if you colour the children red and blue alternately, at every stage you have n/2 sweets moving from red children to blue and vice versa, so actually the total number of sweets among the red children is constant through the process.
  • What happens if you just use technique b)? This felt much more promising.
  • Can you get all the sweets to one child? I considered looking at the child directly opposite (or almost-directly opposite) and ‘sweeping’ all the sweets away from them. It felt like this would work, except if for some parity reason we couldn’t prevent the final child having one (or more, but probably exactly one) sweets at the crucial moment when all the other sweets got passed to him.

Then I got home, and with some paper, I felt I could do all possibilities with n=5, and all but a few when n=6. My conjecture was that all are possible with n odd, and all are possible with n even, except those when none of the red kids or none of the kids get a sweet. I tried n=8, and there were a few more that I couldn’t construct, but this felt like my failure to be a computer rather than a big problem. Again there’s a trade-off between confirming your answer, and trying to prove it.

Claim: If n is even, you can’t achieve the configurations where either the red children or the blue children have no sweets.

Proof: Suppose you can. That means there’s a first time that all the sweets were on one colour. Call this time T. Without loss of generality, all the sweets are on red at T. Where could the sweets have been at time T-1? I claim they must all have been on blue, which contradicts minimality. Why? Because if at least one red child had at least one sweet, they must have passed at least one sweet to a blue neighbour.

Now it remains to give a construction for all other cases. In the end, my proof has two stages:

Step One: Given a configuration, in two steps, you can move a candy two places to the right, leaving everything else unchanged.

This is enough to settle the n odd case. For the even case, we need an extra step, which really corresponds to an initial phase of the construction.

Step Two: We can make some version of the ‘sweeping’ move precise, to end up in some configuration where the red number of children have any number of sweets except 0 or n.

Step one is not so hard. Realising that step one would be a useful tool to have was probably the one moment where I shifted from feeling like I hadn’t got into the problem to feeling that I’d mostly finished it. As ever in constructions, working out how to do a small local adjustment, which you plan to do lots of times to get a global effect, is great. (Think of how you solve a Rubik’s cube for example.)

Step two is notationally fiddly, and I would think very carefully before writing it up. In the end I didn’t use the sweeping move. Instead, with the observation that you can take an adjacent pair and continually swap their sweets it’s possible to set up an induction.

Actual morals: Observing the possibility to make a small change in a couple of moves (Step one above) was crucial. My original moral does still hold slightly. Writing lots of things down didn’t make life easier, and in the end the ideas on the bus were pretty much everything I needed.

[4.1] – one session to a group of 15 year olds is enough to teach you that the canon is always ‘marbles in urns’ never ‘balls’ nor ‘bags’, let alone both.