How to Prove Fermat’s Little Theorem

The following article was prompted by a question from one of my mentees on the Senior Mentoring Scheme. A pdf version is also available.

Background Ramble

When students first meet problems in number theory, it often seems rather different from other topics encountered at a similar time. For example, in Euclidean geometry, we immediately meet the criteria for triangle similarity or congruence, and various circle theorems. Similarly, in any introduction to inequalities, you will see AM-GM, Cauchy-Schwarz, and after a quick online search it becomes apparent that these are merely the tip of the iceberg for the bewildering array of useful results that a student could add to their toolkit.

Initially, number theory lacks such milestones. In this respect, it is rather like combinatorics. However, bar one or two hugely general ideas, a student gets better at olympiad combinatorics questions by trying lots of olympiad combinatorics questions.

I don’t think this is quite the case for the fledgling number theorist. For them, a key transition is to become comfortable with some ideas and notation, particularly modular arithmetic, which make it possible to express natural properties rather neatly. The fact that multiplication is well-defined modulo n is important, but not terribly surprising. The Chinese Remainder Theorem is a theorem’ only in that it is useful and requires proof. When you ask a capable 15-year-old why an arithmetic progression with common difference 7 must contain multiples of 3, they will often say exactly the right thing. Many will even give an explanation for the regularity in occurrence of these which is precisely the content of the theorem. The key to improving number theory problem solving skills is to take these ideas, which are probably obvious, but sitting passively at the back of your mind, and actively think to deploy them in questions.

Fermat’s Little Theorem

It can therefore come as a bit of a shock to meet your first non-obvious (by which I mean, the first result which seems surprising, even after you’ve thought about it for a while) theorem, which will typically be Fermat’s Little Theorem. This states that:

$\text{For a prime }p,\text{ and }a\text{ any integer:}\quad a^p\equiv a\mod p.$ (1)

Remarks

• Students are typically prompted to check this result for the small cases p=3, 5 and 7. Trying p=9 confirms that we do really need the condition that p be prime. This appears on the 2012 November Senior Mentoring problem sheet and is a very worthwhile exercise in recently acquired ideas, so I will say no more about it here.
• Note that the statement of FLT is completely obvious when a is a multiple of p. The rest of the time, a is coprime to p, so we can divide by a to get the equivalent statement:

$\text{For a prime }p,\text{ and }a\text{ any integer coprime to }p:\quad a^{p-1}\equiv 1\mod p.$ (2)

• Sometimes it will be easier to prove (2) than (1). More importantly, (2) is sometimes easier to use in problems. For example, to show $a^{p^2}\equiv a \mod p$, it suffices to write as:

$a^{p^2}\equiv a^{(p-1)(p+1)+1}\equiv (a^{p-1})^{p+1}\times a\equiv 1^{p+1}\times a \equiv a.$

• A word of warning. FLT is one of those theorems which it is tempting to use on every problem you meet, once you know the statement. Try to resist this temptation! Also check the statement with small numbers (eg p=3 ,a=2) the first few times you use it, as with any new theorem. You might be surprised how often solutions contain assertions along the lines of

$a^p\equiv p \mod (a-1).$

Proofs

I must have used FLT dozens of times (or at least tried to use it – see the previous remark), before I really got to grips with a proof. I think I was daunted by the fact that the best method for, say, p=7, a careful systematic check, would clearly not work in the general case. FLT has several nice proofs, and is well worth thinking about for a while before reading what follows. However, I hope these hints provide a useful prompt towards discovering some of the more interesting arguments.

Induction on a to prove (1)

• Suppose $a^p\equiv a\mod p$. Now consider $(a+1)^p$ modulo p.
• What happens to each of the (p+1) terms in the expansions?
• If necessary, look at the expansion in the special case p=5 or 7, formulate a conjecture, then prove it for general p.

Congruence classes modulo p to prove (2)

• Consider the set $\{a,2a,3a,\ldots,(p-1)a\}$ modulo p.
• What is this set? If the answer seems obvious, think about what you would have to check for a formal proof.
• What could you do now to learn something about $a^{p-1}$?

Combinatorics to prove (1)

• Suppose I want a necklace with p beads, and I have a colours for these beads. We count how many arrangements are possible.
• Initially, I have the string in a line, so there are p labelled places for beads. How many arrangements?
• Join the two ends. It is now a circle, so we don’t mind where the labelling starts: Red-Green-Blue is the same as Green-Blue-Red.
• So, we’ve counted some arrangements more than once. How many have we counted exactly once?
• How many have we counted exactly p times? Have we counted any arrangements some other number of times?

Group Theory to prove (2)

This is mainly for the interest of students who have seen some of the material for FP3, or some other introduction to groups.

• Can we view multiplication modulo p as a group? Which elements might we have to ignore to ensure that we have inverses?
• What is $\{1,a,a^2,a^3,\ldots\}$ in this context? Which axiom is hardest to check?
• How is the size of the set of powers of a modulo p related to the size of the whole group of congruences?
• Which of the previous three proofs is this argument is most similar to this one?
• Can you extend this to show the Fermat-Euler Theorem:

$\text{For any integer }n,\text{ and }a\text{ coprime to }n:\quad a^{\phi(n)}\equiv 1 \mod n,$

where $\phi(n)$ counts how many integers between 1 and n are coprime to n.

The first round of the British Mathematical Olympiad (BMO1) takes place tomorrow. Last year I wrote a brief note to my mentoring students about the exam. Most of the advice is fairly obvious, but I guess it never does any harm to be reminded. In particular, while it is tempting to give lots of mathematical guidance, under exam pressure good deductive ideas either will or won’t come, and there’s relatively little to be done about it in advance to help. However, especially for students for whom this is their first experience of a long olympiad style paper, there are a few practical and general points to be made, so you have the best chance of turning good ideas into good solutions during the time allowed.

DON’T waste time. 3.5 hours is a long time, but it will pass quickly when you have lots to think about. Obviously, you will inevitably spend some time just thinking vaguely about the problems, or even daydreaming, just to give your brain a break. Don’t worry about that, but do try not to waste time pursuing methods which don’t look like they are working. If you have made 6 algebraic substitutions and the expression now takes up an entire line, ask yourself whether you’re going anywhere. If your geometrical diagram now has dozens of extra points, or if you are trying to solve a polynomial in n variables where n is large, question yourself. Maybe you’re missing something more obvious?

On the subject, DO flit between questions. The rubric says that full solutions are better than partial solutions. However, if moving to another question allows you to take a fresh stab at the first one in 15 minutes or whatever, that is a good thing.

Also, DO take food or drink (within reason and so long as whoever is invigilating doesn’t mind), if you think it will help. 3.5 hours of concentration can be draining! The 200g value pack of Dairy Milk was my preference back in the day…

On a more mathematical note, DON’T draw rubbish geometrical diagrams. DO use a compass and a ruler. These geometry problems normally want you to spot similar triangles or something like that. These will be much much easier to find if they actually look similar on your diagram! Markers also like seeing good diagrams.

DO write up relevant bits of your rough. It’s a good way to grab small marks, and you never know, you might have had all the right ideas, just missed the final crucial step. It sometimes says not to hand in rough: so make sure what you hand in looks vaguely neat, and has key steps or results you’ve proved underlined or in a box, so that they are as visible as possible to the marker. Checking small cases explicitly will be useful to your understanding of the problem, and so may gain credit.

DON’T wait until the end to write up bits of your rough. The temptation to keep working on them will be too strong, and you might have forgotten what seemed interesting an hour ago. Crucially, deciding carefully what the most important steps of your working are may very well help you to finish the problem.

DO read the question properly. Trying to prove something false will waste your time; trying to prove something simpler than the actual question will cost you marks. Things to consider include:

• If the question says ‘If and only if’, you have to prove it both ways. Similarly if it asks for a converse.
• Check what the domains are. Does n have to be an integer or is it a real number? Can it be zero?
• In a counting question, does order matter?
• Is the triangle allowed to be obtuse? Does this change anything important in the argument?

DON’T waffle. If you are writing a massive load of text, have a think about whether that’s a good idea. It is very easy, especially for fiddly combinatorics questions, for a simple equation to turn into a sprawling essay. Keeping sentences very short (no long subordinate clauses) and leaving space between displayed maths and words will help. Remember that whether or not you know what you are doing, you want to GIVE THE IMPRESSION that you know what you are doing!

DO be clever. Sometimes the questions are hard but routine, sometimes they require clever ideas. If your current method isn’t making any progress and you have a crazy idea, try it – it might be just the thing.

However, DON’T be too clever. It’s very tempting, especially to new mentoring students, to try to use every bit of theory you’ve recently learned. Remember that not every geometry question requires the Angle Bisector Theorem, and you don’t always need to deploy Fermat’s Little Theorem or even modular arithmetic on every problem about integers. In particular, avoid applying anything you don’t properly understand – under the pressure of an exam, it’s easy to forget the details, and end up assuming something that is false!

DO relax. I know that is easier said than done, but this is an academically stressful time of life, so enjoy the fact that this is a rare exam where doing well is not of huge importance to the rest of your life. I haven’t seen this year’s paper, but the questions are normally interesting, and should bring out the best in a strong young mathematician. As with many things, if you stop worrying about the outcome, you often do better than you might expect.

Best of luck to everyone sitting the exam tomorrow!

Modular arithmetic – Beyond the Definitions

Modular arithmetic is a relatively simple idea to define. The natural motivation is to consider a clock. The display of a standard analogue clock makes no distinction between 4am, 4pm, and 4pm next Thursday. This is a direct visualisation of the integers modulo 12. Instead of counting in the usual way, where each successive integer is different to all those considered previously, here, every time we get to a multiple of 12, we reset our count back to zero. As a result, this procedure is often referred to as clock arithmetic’.

A common problem for good students, for example those starting the UKMT’s Senior Mentoring Scheme, is that the idea of modular arithmetic seems very simple, but it’s hard to work out how it might be useful in application to problems. My claim is that the language of modular arithmetic is often the best way to discuss divisibility properties in problems about whole numbers. In particular, the behaviour of powers (ie $m^n$ and so forth) is nice in this context, and the notation of modular arithmetic is the only sensible way to approach it. Anyway, let’s begin with a quick review of the definitions.

Definitions

We are interested in divisibility by some fixed integer $n\geq 2$, and the remainders given after we divide by n. Given an integer M, we can write this as:

$M=kn+a,\quad\text{ where }a=0,1,\ldots,n-1,$

and this decomposition is unique. We then say that M is congruent to a modulo n. Note that working modulo n, means that we are interested in remainders after division by n (rather than a or k or M or anything else). This has the feeling of a function or algorithm applied to M. We get told what M is, then work out the remainder after division by n, and say that this is M mod n‘.

This is fine, but it very much worth bearing in mind a slightly different interpretation. Working modulo n is a way of saying that we aren’t interested in the exact value of an integer, only where it lies on the n-clock. In particular, this means we are viewing lots of integers as the same. The sameness’ is actually more important in lots of arguments than the position on the n-clock.

More formally, we say that $a\equiv b$ or a is congruent to b modulo n, if they have the same remainder after division by n. Another way of writing this is that

$a\equiv b\quad \iff \quad n|a-b.$

Sets of integers which are equivalent are called congruence classes. For example $\{\ldots,-4,-1,2,5,8,\ldots\}$ is a congruence class modulo 3. Note that under the first definition, we consider all elements here to be congruent to 2, but in a particular question it may be more useful to consider elements congruent to -1, or some combination.

These definitions are equivalent, but it can be more useful to apply this second definition for proving things, rather than writing out $b=kn+a$ or whatever all the time.

Everything has been set up in terms of addition, so it is easy to see that addition works well on congruence classes. That is:

$\text{If }a\equiv b,c\equiv d,\quad\text{then }a+c\equiv b+d.$

We could argue via a clock argument, but the second definition works very well here:

$\text{We have }n|a-b,n|c-d,\quad\text{and so }n|(a+c)-(b+d),\text{ exactly as required.}$

We want to show that a similar result happens for multiplication. But this holds as well:

$\text{If }a\equiv b,c\equiv d,\quad\text{then }n|c(b-a)\text{ and }n|b(c-d).$

$\Rightarrow n|ac-bd,\text{ that is }ac\equiv bd.$

Let’s just recap what this means. If we want to know what $157\times 32$ is modulo 9, it suffices to say that $157\equiv 4$ and $32\equiv 5$, and so $157\times 32\equiv 4\times 5\equiv 2$. In a more abstract setting, hopefully the following makes sense:

$\text{ODD}\times\text{ODD}=\text{ODD};\quad \text{EVEN}\times\text{EVEN}=\text{EVEN};\quad \text{ODD}\times\text{EVEN}=\text{EVEN}.$

This is exactly the statement of the result in the case n=2, where the congruence classes are the odd integers and the even integers.

Powers

What happens if we try to extend to powers? Is it the case that

$\text{if }a\equiv b,c\equiv d,\quad\text{then }a^c\equiv b^d?$ Continue reading

Senior Mentoring – Solving equations in integers

At school, we learn how to solve equations like $4x+3=-5$. Sometimes the answer is an integer (that is, a whole number), sometimes it isn’t. If we change that -5 to a -6, the solution to the equation above won’t be an integer any more. What is more, if we change the 5 to $\sqrt{5}$, the solution won’t even be rational (that is, a normal fraction that looks like $\frac{p}{q}$). But, the key point is that in the introduction to the chapter on linear equations, or whatever they call it, in a textbook, there will be an explanation of how to solve this generic class of problems, which will work whether the numbers in the equation are integers, fractions, or even complex numbers!

Later, we move on to quadratic equations. Spotting factorisations is a good way to get started, then you could learn the quadratic formula to use in those cases where the factors are harder to spot. Alternatively, there’s the method of ‘completing the square’, which takes more steps, but means you can do the arithmetic in clear stages. Then, if you think about why the quadratic formula works, you realise that you are just completing the square for all quadratics (using a, b, c in place of any numerical coefficients). In other words, though there are a few different ways of approaching the problem, the sensible methods are essentially the same. And then you might get problems about logarithms or matrices for which you use your new knowledge about these objects to turn the question into a quadratic equation, say. At all times, there’s an implicit assumption that any question can be turned into an equation which you ‘know how to solve’.

So when you first come across a problem which asks you to find integer solutions to an equation, as have appeared a few times on this year’s Senior Mentoring scheme, it is hard to know where to start. You could try to solve it as if it wasn’t about integers, then select the integer solutions at the end. But this could be difficult, or might not even make sense. You might have a condition that x is odd: what does this mean if x isn’t even an integer? You also often start making sensible-looking substitutions: x is odd, so write x=2n+1 for some n which is also an integer, then work with n. But this can sometimes cause you to end with a huge number of variables which you can’t really relate to, and horribly complicated expressions.

As with so much of maths, experience is very useful. If you’ve seen solutions to twenty moderately tricky integer equations, you’ll have more ideas about how to think about starting another problem. So here are a few tricks that might come in handy.

Smallest solution: Are you stuck trying to prove there aren’t any solutions? Do all your substitutions just give more complicated versions of the original equation? What about considering the smallest solution? You might have to be careful about what you mean by ‘small’, but suppose you consider the smallest solution x, and end up finding a smaller solution x'<x. What could this mean? Well it can only mean that there isn’t a smallest solution! In other words, there are no solutions. You should think about why the fact that we are working with integers means that if there is any solution there must be a smallest one. Then see if you can find all solutions in positive (>0) integers to $x^2-2y^2=0$. Continue reading