Symmedians and Balkan MO 2017 Q2

While I was away, I wrote about my latest approach to teaching geometry at olympiad camps. This post will end up being about Q2 from the Balkan MO which took place yesterday in Macedonia, but first there is quite a long prelude. My solution, and probably many solutions, to this problem made use of a standard configuration in triangle geometry, namely the symmedian. I want to introduce the configuration, give some simpler examples in practice, and along the way talk about my slightly patched-together philosophy about the merits of practising Euclidean geometry.

The symmedian

Draw a triangle ABC, with A at the top of the page, and extend the rays AB and AC. The median is the line from A through M, the midpoint of BC. Now take points D and E on AB and AC respectively. The following properties are equivalent:

  • DE is parallel to BC;
  • triangle ADE is similar to triangle ABC;
  • the median of ABC passes through the midpoint of DE, and thus is also the median of ADE.

I think it’s a little awkward to prove either of the first two from the third – ratios of areas works – but the rest of the equivalences are straightforward. Later I’m going to talk about the difference between an exercise and a problem. These are all, at best, exercises.

Now take B’ on the ray AC, and C’ on the ray AB such that triangle AB’C’ is similar to triangle ABC. One way to achieve this is to take B’ and C’ to be the reflections in the angle bisector of A of B and C respectively (so then AB’=AB and AC’=AC). We say the line B’C’ is antiparallel to BC, as is any other line DE parallel to B’C’. (Probably this should say ‘with respect to triangle ABC’ or similar, but the context here is very clear, and I want this to seem natural rather than opaque.) Note that DE is an antiparallel line iff BCED is a cyclic quadrilateral. We should remember that, as cyclic quadrilaterals are the signposts for progress in both exercises and problems.

The median of triangle AB’C’ obeys the same equivalences as described above, and so bisects any antiparallel segment. We call the median of triangle AB’C’ the symmedian of triangle ABC. Using the first set of equivalences, the symmedian of triangle ABC bisects any line antiparallel to BC. Furthermore, by construction, the symmedian is the image of the median of ABC under reflection in the bisector of the angle at A. We sometimes say that the symmedian is the isogonal conjugate of the median.

That’s my definition. Note that there was essentially one definition then a couple of easy equivalent definitions. At no point again will I discuss the equivalence of these definitions – we have to take that for granted if we want to get on to more interesting things.

Intersection of tangents + concurrency

Now, in triangle ABC, draw the tangents to the circumcircle at B and C. These meet at P. It turns out that AP is the symmedian. This could have been our definition of a symmedian, but it wasn’t, so let’s quickly prove this.

Trigonometric arguments are very accessible, but I’ll give a Euclidean argument. Draw the antiparallel DE through P, as shown. Our task is to show that EP=PD. At this point, I would again say that this is an exercise.

We colour the angle ABC in green. Two angles around point C share this measure by the alternate segment theorem. The angle at E shares this measure because DE is antiparallel. Therefore CPE is isosceles, and so EP=CP. But CP=BP, so by applying the same argument for the orange angles, we get EP=CP=BP=DP as required.
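For anyone who would like numerical reassurance before trusting the angle chase, here is a minimal sketch in Python. It is not part of the argument. It assumes the circumcircle is the unit circle in the complex plane and uses the standard fact that the tangents at b and c then meet at 2bc/(b+c); the name isogonal_ratio is made up for this illustration. The lines AM and AP are reflections of each other in the bisector at A (with both rays inside the angle) when the returned ratio is a positive real number.

```python
import cmath

def isogonal_ratio(ta, tb, tc):
    """Return (m-a)(p-a) / ((b-a)(c-a)) for A, B, C at angles ta, tb, tc on the unit circle."""
    a, b, c = (cmath.exp(1j * t) for t in (ta, tb, tc))
    m = (b + c) / 2            # midpoint of BC, so AM is the median
    p = 2 * b * c / (b + c)    # intersection of the tangents at B and C
    # arg((m-a)/(b-a)) is the angle from AB to AM, and
    # arg((c-a)/(p-a)) is the angle from AP to AC.
    # These agree exactly when the product below is a positive real number.
    return (m - a) * (p - a) / ((b - a) * (c - a))

ratio = isogonal_ratio(1.5, 3.6, 5.7)   # an arbitrary acute triangle
print(abs(ratio.imag) < 1e-9, ratio.real > 0)
```

A check like this only confirms the statement for the triangles you try, so it complements the Euclidean proof rather than replacing it.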

Pause to regroup. Proving this wasn’t hard, but it was perhaps surprising. If this was all new to you, and I told you to consider the reflection of the median in the angle bisector, you probably wouldn’t instantly exclaim “it goes through the tangent intersection!” So this is a useful piece of knowledge to have gained, in case we ever have to work with the intersection of two tangents like this. Maybe it won’t be useful, but maybe it will. Maybe the statement itself won’t but some extra insights from the proof will be useful, like the fact that we actually showed P is the centre of the circle BCED, and thus angles ECD=EBD=90.

A second property is that in a triangle ABC, the symmedian from A, the symmedian from B and the symmedian from C intersect at, naturally, the symmedian point, which is usually denoted K. This comes from the fact that each symmedian is the isogonal conjugate of the respective median, and the medians are known to concur at the centroid. I’m not going to get into this now.

Configurations – an example

Here’s a problem. Take an isosceles trapezium ABCD as shown (i.e. throughout I don’t want to worry about alternative diagrams).

Let M be the midpoint of AD, and let E be the point on CM such that angle DBM = EBA. Prove that ABCDE is cyclic.

Well, certainly ABCD is cyclic. So we just need to show E also lies on that circle. And we have two equal angles, but they aren’t in the right place to conclude this immediately. However, we have angle MCA = DBM = EBA, so ABCE is cyclic, and the result follows.

Why is angle MCA = DBM? Well, the isosceles trapezium has an axis of (reflective) symmetry, and MCA is the image of DBM under that reflection. Simple. If we wanted to do it with congruent triangles, this would all be a bit more laborious. First we would have to show BD=AC using one set of congruent triangles, then CM=BM using another, finally finishing using DM=MA. This is much less interesting. The symmetry of the configuration is a higher-level observation which could be proved from the axioms of geometry if necessary, but gives us more information more quickly. When we use a configuration like the symmedian configuration, we are really doing a higher-again-level version of this.

Anyway, that problem is fine, but it’s not especially difficult.

Consider instead the following problem. (I saw this online, possibly with slightly different notation, a few days ago and can no longer find the link. If anyone can help, I will add the link.)

Let AB be a chord of a circle, with midpoint M, and let the tangents at A and B meet at P. Consider a line through P which meets the circle at C and D in that order. Extend CM to meet the circle again at E. Show DME is isosceles.

Here’s a diagram, though it includes some clues.

I thought this was a fun problem, and for a while I couldn’t do it because despite lots of equal angles and equal lengths, I couldn’t conjure any congruent triangles in the right places, and I didn’t care enough about solving it to get involved in trigonometry. Then came the moment of insight. We have a midpoint, and also the intersection of the tangents. So DP is the symmedian of triangle DAB, and DM is the median. This gives us the two equal orange angles. Cyclicity gives us an extra equal angle at E as well.

Note now that the situation is very very similar to the previous question (after changing some of the labels), only this time we know ACBDE is cyclic, but don’t know that ABDE is an isosceles trapezium. If ABDE is an isosceles trapezium, we are clearly finished, as then by the same symmetry argument, EM=DM. This direction is probably harder to prove than the direction of the previous problem. Again there are a couple of ways to proceed, but one way is to consider the point E’ such that ABDE’ is an isosceles trapezium, and argue that E’ lies on the given circle, and the circle through BME, and thus must coincide with E, in a reverse reconstruction argument.
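If you want to convince yourself that DM=EM really does hold before hunting for a proof, a quick numerical experiment is cheap. The following is a rough sketch under assumptions of my choosing: the circle is the unit circle in the complex plane, the tangents at a and b meet at 2ab/(a+b), and the helper names second_intersection and chord_problem are hypothetical. It uses the fact that a point z lies on the chord through unit-circle points c and e exactly when z + c e conj(z) = c + e.

```python
import cmath

def second_intersection(c, z):
    """Second point where the line through c (on the unit circle) and z meets the unit circle."""
    # Solve z + c*e*conj(z) = c + e for e.
    return (c - z) / (c * z.conjugate() - 1)

def chord_problem(ta, tb, tc):
    """Return the lengths (DM, EM) for A, B, C at angles ta, tb, tc on the unit circle."""
    a, b, c = (cmath.exp(1j * t) for t in (ta, tb, tc))
    m = (a + b) / 2                  # midpoint of the chord AB
    p = 2 * a * b / (a + b)          # intersection of the tangents at A and B
    d = second_intersection(c, p)    # line PC meets the circle again at D
    e = second_intersection(c, m)    # line CM meets the circle again at E
    return abs(d - m), abs(e - m)

dm, em = chord_problem(1.0, -1.0, 0.3)   # C on the arc nearer to P
print(abs(dm - em) < 1e-9)
```

The experiment says nothing about why the two lengths agree, which is where the symmedian comes in.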

Anyway, this is all slightly a matter of taste, but I would say the second problem is much much more fun than the first problem, even though the second part of the solution is basically the first problem but in a more awkward direction. If you’re going to do Euclidean geometry at all (very much another question), I think you should do questions like the second question wherever possible. And the enjoyable ‘aha moment’ came from knowing about the symmedian configuration. Is it really plausible that you’d look at the original diagram (without the dashed orange lines) and think of the antiparallel to AB in triangle DAB through point P? Probably not. So knowing about the configuration gave access to the good bit of a nice problem.

‘Philosophy of this sort of thing’

If the goal was to solve the second problem in a competition, knowing about the symmedian configuration would be a big advantage. I’ve tried to justify a related alternative view that knowing about the configuration gave access to an enjoyable problem. The question is how many configurations to study, and how hard to study them?

We tend not to think of cyclic quadrilaterals as a special configuration, but that is what they are. We derived circle theorems from the definition of a circle so that we don’t always have to mark on the centre, every single time we have a cyclic quadrilateral. So becoming familiar with a few more is not unreasonable. In particular, there are times when proofs are more important than statements. In research (certainly mine), understanding how various proofs work is the most important aspect, for when you try to extend them or specialise. And in lots of competition problems, the interesting bit is normally finding the argument rather than basking in wonder at the statement (though sometimes the latter is true too!).

To digress briefly. In bridge, I don’t know enough non-obvious motifs in bidding or gameplay to play interesting hands well. I trust that if I thought about some of it very very carefully, I could come up with some of them, especially in gameplay, but not in real time. And it is supposed to be fun right?! Concentrating very very hard to achieve a basic level of competence is not so enjoyable, especially if it’s supposed to be a break from regular work. The end result of this is that I don’t play bridge, which is a shame, because I think the hurdles between where I am currently and a state where I enjoy playing bridge are quite low. If I knew I was going to play bridge regularly, a bit of time reading about conventions would be time well spent. And obviously this applies equally in pursuits which aren’t directly intellectual. Occasionally practising specific skills in isolation broadens overall enjoyment in sport, music, and probably everything. As anyone who’s played in an orchestra knows, there are standard patterns that come up all the time. If you practise these occasionally, you get to a stage where you don’t really need to concentrate that hard in the final movement of Beethoven 5, and instead can listen to the horns, make funny faces at the first violins, and save your mental energy for the handful of non-standard tricky bits. And of course, then move on to more demanding repertoire, where maybe the violas actually get a tune.

This is highly subjective, but my view is that all these examples are broadly similar to configurations in geometry, and in all of them a little goes a long way.

How? In lots of the geometry configurations you might meet in, for example, a short session at a training camp, most of the conclusions about the configurations have proofs which, like in our symmedian case, are simple exercises. Once you’ve got over some low initial experience hurdles, you have to trust that you can normally solve any simple exercise if required. If you can’t, moving on and returning later, or asking for help is a good policy. The proof shown above that symmedians pass through tangent meet points (and especially a trigonometric alternative) really isn’t interesting enough to spend hours trying to find it. The statements themselves are more useful and interesting here. And it can often be summarised quite quickly: “symmedians are the isogonal conjugates of the medians, so they bisect antiparallels, meet at K, and pass through the alternate tangent meeting points.” Probably having a picture in your mind is even simpler.

There’s a separate question of whether this is worthwhile. I think solving geometry problems occasionally is quite fun, so I guess yes I do think it is worthwhile, but I understand others might not. And if you want to win maths competitions, in the current framework you have to solve geometry problems under time pressure. But from an educational point of view, even though the statements themselves have no real modern research value, I think a) that’s quite a high bar to set, and there’s no a priori reason why they should – >99.9% of things anyone encounters before university have no value to modern research maths; b) in terms of knowledge acquisition, it’s similar in spirit to lots of things that are relevant to later study. I don’t have to solve PDEs very often, but when I do, I hope they are equivalent or similar to one of the small collection of PDEs I do know how to solve. If I worked more with PDEs, the size of this collection would grow naturally, after some initial struggles, and might eventually match my collection of techniques for showing scaling limits of random processes, which is something I need to use often, so the collection is much larger. Maybe that similarity isn’t enough justification by itself, but I think it does mean it can’t be written off as educationally valueless.

Balkan MO 2017 Question Two

An acute angled triangle ABC is given, with AB<AC, and \omega is its circumcircle. The tangents t_B,t_C at B,C respectively meet at L. The line through B parallel to AC meets t_C at D. The line through C parallel to AB meets t_B at E. The circumcircle of triangle BCD meets AC internally at T. The circumcircle of triangle BCE meets AB extended at S. Prove that ST, BC and AL are concurrent.

Ok, so why have I already written 1500 words about symmedians as a prelude to this problem? Because AL is a symmedian. This was my first observation. This observation is then a route into non-Euclidean solutions. It means, for example, that you can describe the point of concurrency fairly explicitly with reference to triangle ABC. If you wish, you can then proceed using areal coordinates. One member of the UK team, who I know is perfectly capable of finding a synthetic solution, did this. And why not? It’s a competition, and if you can see a method that will definitely work, and definitely take 45 minutes (or whatever) then that’s good.

I was taking a break from work in my office, and had no interest in spending the time evaluating determinants because that isn’t enjoyable at any level, so I focused on the geometry.

I think there’s a good moral from the diagram above, which is the first moderately correct one I drew. I often emphasise that drawing an accurate diagram is important, as it increases the chance that you’ll spot key properties. In this case though, where you’re trying to examine a known configuration, I think it’s more important what you choose to include on your diagram, than how accurately you draw it. (In a moment, we’ll see why it definitely wasn’t very accurate.)

In particular, what’s not on the diagram? E is not on the diagram, and S got added later (as did the equal length signs in TB and CS, which rather spoil what’s about to happen). My first diagram was wildly incorrect, but it also suggested to me that the line ST was hard to characterise, and that I should start by deducing as much as possible about S and T by themselves. So by symmetry, probably it was enough just to deduce as much as possible about T.

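Label the angles of triangle ABC as ∠A, ∠B, ∠C. By the alternate segment theorem, angle DCB = ∠A, and since BD is parallel to AC, angle DBC = ∠C, so angle BDC = ∠B. As BDCT is cyclic, with T and D on opposite sides of BC, angle BTC = 180 - ∠B, and so angle ATB = ∠B. Therefore triangle ATB is similar to triangle ABC, and TB is an antiparallel in triangle ABC. (Note this doesn’t look antiparallel on my diagram at all, but as I said, this didn’t really matter.) Obviously you then guess that CS is also an antiparallel, and on a different diagram I checked this, for essentially the same reasons.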

We haven’t yet made any use of the symmedian, but this is clearly where it’ll be useful. Note that if we didn’t know about everything in the prelude, we might well have deduced all of this, but we might not have thought to prove that AL bisects TB unless we’d drawn a very accurate diagram.

At this point, we have to trust that we have enough information to delete most of the diagram, leaving just {A,B,C,S,T} and the line AL. There are a few ways to finish, including similar triangles if you try very hard or trigonometry if you do it right, but again knowledge of some standard configurations is useful. Probably the quickest way is to use Ceva’s theorem in triangle ACS. You can also use Menelaus’ theorem in ABC, so long as you know a little bit about where the symmedian meets the opposite side.

An alternative is the following. We have a complete quadrilateral here, namely BTCS, and the intersection of all its diagonals. One is A, one is the proposed point of concurrency, and one is the point at infinity, since TB || CS. You can chase that, but I found it more clear to let P be the intersection of ST and BC (which we want to prove lies on AL), then look at the complete quadrilateral ATPB. Then AT and BP meet at C, and AB and TP meet at S. So if we look at where the diagonals of ATPB meet the line CS, we have a harmonic range.

If I’d wanted, I could instead have written the prelude about harmonic ranges, but I had fewer ideas how to set these up in a slick Euclidean way. Also, it feels better to talk about the start, rather than the end of a proof, especially when there were alternative endings. Anyway, a harmonic range is a collection of two pairs of points on a line (A, B; C, D), satisfying the following ratio of directed lengths:

\frac{AC}{BC} = -\frac{AD}{BD}.

A classic example is when D is the point at infinity, the RHS is -1, and so C is the midpoint of AB. Being happy about using the point at infinity is a property of projective geometry, of which this is a good first example. Anyway, returning to the problem, we are looking at where the diagonals of ATPB meet line CS, and this pair of points forms a harmonic range with (C,S). TB meets CS at the point at infinity, and so AP meets CS at the midpoint of CS. But from the symmedian configuration, AL bisects CS, so AP and AL are in fact the same line, and so P lies on AL as required.
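Since the diagrams in this post were admittedly not very accurate, it may be reassuring to check the concurrency numerically. The sketch below is a sanity check, not a solution. It assumes the circumcircle is the unit circle in the complex plane, and all helper names (cross, line_meet, circumcentre, other_meet, balkan_offset) are hypothetical. It builds D, E, T, S directly from the problem statement, intersects ST with BC, and returns the distance from that point to the line AL, which should be zero up to rounding.

```python
import cmath

def cross(u, v):
    """2D cross product of two complex numbers viewed as plane vectors."""
    return (u.conjugate() * v).imag

def line_meet(p, u, q, v):
    """Intersection of the line through p with direction u and the line through q with direction v."""
    s = cross(q - p, v) / cross(u, v)
    return p + s * u

def circumcentre(p, q, r):
    """Centre of the circle through p, q, r, as the meet of two perpendicular bisectors."""
    return line_meet((p + q) / 2, 1j * (q - p), (p + r) / 2, 1j * (r - p))

def other_meet(centre, x, y):
    """Second point where the line xy meets the circle with the given centre through x."""
    u, w = y - x, x - centre
    t = -2 * (u.conjugate() * w).real / abs(u) ** 2
    return x + t * u

def balkan_offset(ta, tb, tc):
    """Distance from the point where ST meets BC to the line AL."""
    a, b, c = (cmath.exp(1j * t) for t in (ta, tb, tc))
    l_pt = 2 * b * c / (b + c)                        # tangents at B and C meet at L
    d = line_meet(b, c - a, c, 1j * c)                # through B parallel to AC, meets t_C
    e = line_meet(c, b - a, b, 1j * b)                # through C parallel to AB, meets t_B
    t_pt = other_meet(circumcentre(b, c, d), c, a)    # circle BCD meets line AC again at T
    s_pt = other_meet(circumcentre(b, c, e), b, a)    # circle BCE meets line AB again at S
    x = line_meet(s_pt, t_pt - s_pt, b, c - b)        # ST meets BC
    return abs(cross(x - a, l_pt - a)) / abs(l_pt - a)

# A, B, C at these angles give an acute triangle with AB < AC.
print(balkan_offset(1.9, 3.7, 5.6) < 1e-9)
```

This is roughly the computational counterpart of the areal coordinates approach: it will confirm the result, but it hides the antiparallels that make the synthetic solution satisfying.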

I think this was a brilliant example of when knowing a bit of theory is enjoyable. It wasn’t at all obvious initially how to use the symmedian property, but then the observation that TB is antiparallel felt like a satisfying breakthrough, and yet didn’t immediately kill the problem.

Means and Markov’s Inequality

The first time we learn what a mean is, it is probably called an average. The first time we meet it in a maths lesson, it is probably defined as follows: given a list of values, or possibilities, the mean is the sum of all the values divided by the number of such values.

This can be seen as both a probabilistic and a statistical statement. Ideally, these things should not be different, but at a primary school level (and some way beyond), there is a distinction to be drawn between the mean of a set of data values, say the heights of children in the class, and the mean outcome of rolling a dice. The latter is the mean of something random, while the former is the mean of something fixed and determined.

The reason that the same method works for both of these situations is that the distribution for the outcome of rolling a dice is uniform on the set of possible values. Though this is unlikely to be helpful to many, you could think of this as a consequence of the law of large numbers. The law, applied jointly to all possible values, says that you expect to have roughly equal numbers of each value when you take a large number of samples. The strong law says that, with probability one, we see this effect in the limit as we take increasingly large samples. Note that it is not trivial to apply LLN jointly to all values for a general continuous random variable. The convergence of sample distribution functions to the cdf of the underlying distribution is the content of the Glivenko-Cantelli Theorem.
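To see the law of large numbers at work on a fair die, here is a small simulation sketch. The function name roll_frequencies, the seed and the sample size are arbitrary choices for illustration. With a large number of rolls, each face should appear with frequency close to 1/6, and the sample mean should be close to 3.5.

```python
import random
from collections import Counter

def roll_frequencies(n, seed=0):
    """Roll a fair die n times; return the frequency of each face and the sample mean."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    rolls = [rng.randint(1, 6) for _ in range(n)]
    counts = Counter(rolls)
    freqs = {face: counts[face] / n for face in range(1, 7)}
    return freqs, sum(rolls) / n

freqs, sample_mean = roll_frequencies(60000)
print(freqs)
print(sample_mean)
```

The sample mean here is the "data" mean of the rolls, and it approaches the "probabilistic" mean of the die precisely because the frequencies approach the uniform probabilities.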

In any case, this won’t work when there isn’t this symmetry where all values are equally likely. So in general, we have to define the mean of a discrete random variable as

\mu=\sum k\mathbb{P}(X=k).

In other words, we are taking a sum of values multiplied by probabilities. By taking a suitable limit, a sum weighted by discrete probabilities converges to an integral weighted by a pdf. So this is a definition that will easily generalise.
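As a concrete instance of the weighted sum, here is a minimal sketch comparing a fair die with a loaded one. The loaded probabilities are invented for illustration, and exact fractions are used so there is no rounding to worry about.

```python
from fractions import Fraction

def mean(pmf):
    """Mean of a discrete random variable given as {value: probability}."""
    assert sum(pmf.values()) == 1, "probabilities must sum to one"
    return sum(k * p for k, p in pmf.items())

fair_die = {k: Fraction(1, 6) for k in range(1, 7)}

# Hypothetical loaded die: a six comes up half the time.
loaded_die = {k: Fraction(1, 10) for k in range(1, 6)}
loaded_die[6] = Fraction(1, 2)

print(mean(fair_die), mean(loaded_die))
```

The fair die gives 7/2, matching "sum of the values divided by the number of values", while the loaded die gives 9/2, which the naive average of the six faces would miss.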

Anyway, typically the next stage is to discuss the median. In the setting where we can define the mean directly as a sum of values, we must be given some list of values, which we can therefore write in ascending order. It’s then easy to define the median as the middle value in this ordered list. If the number of elements is odd, this is certainly well-defined. If the number is even, it is less clear. A lot of time at school was spent addressing this question, and the generally-agreed answer seemed to be that the mean of the middle two elements would do nicely. We shouldn’t waste any further time addressing this, as we are aiming for the continuous setting, where in general there won’t be discrete gaps between values in the support.

This led onwards to the dreaded box-and-whisker diagrams, which represent the min, lower quartile, median, upper quartile, and max in order. The diagram is structured to draw attention to the central points in the distribution, as these are in many applications of greater interest. The question of how to define the quartiles if the number of data points is not 3 modulo 4 is of exponentially less interest than the question of how to define the median for an even number of values, in my opinion. What is much more interesting is to note that the middle box of such a diagram would be finite for many continuous distributions with infinite support, such as the exponential distribution and the normal distribution.

Note that it is possible to construct any distribution as a function of a U[0,1] distribution by inverting the cdf. The box-and-whisker diagram essentially gives five points in this identification scheme.
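The inversion can be made concrete for the exponential distribution, where the cdf 1 - e^{-rate x} inverts in closed form. The sketch below is an illustration with made-up function names: it computes the three quartiles that bound the box, and also generates samples by feeding uniform random numbers through the inverse cdf.

```python
import math
import random

def exponential_quantile(u, rate=1.0):
    """Inverse cdf of the exponential distribution: solves 1 - exp(-rate * x) = u."""
    return -math.log(1.0 - u) / rate

def sample_exponential(n, rate=1.0, seed=0):
    """Draw n exponential samples by applying the inverse cdf to U[0,1) draws."""
    rng = random.Random(seed)
    return [exponential_quantile(rng.random(), rate) for _ in range(n)]

# Lower quartile, median and upper quartile: the finite middle box.
print([exponential_quantile(u) for u in (0.25, 0.5, 0.75)])

samples = sorted(sample_exponential(20001))
print(samples[10000])   # the sample median, to compare with log 2
```

The min and max correspond to u=0 and u=1, and the latter is infinite here, which is why only the middle box is a finite object for this distribution.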

Obviously, the ordered list definition fails to work for such distributions. So we need a better definition of median, which generalises. We observe that half the values are greater than the median, and so in a probabilistic setting, we say that the probability of being less than the median is equal to the probability of being greater. So we want to define it implicitly as:

\mathbb{P}(X>M)=\mathbb{P}(X<M).

So for a continuous distribution without atoms,

\mathbb{P}(X>M)=\frac12,

and this uniquely defines M.

The natural question to start asking is how this compares to the mean. In particular, we want to discuss the relative sizes. Any result about the possible relative values of the mean and median can be reversed by considering the negation of the random variable, so we focus on continuous random variables with non-negative support. If nothing else, these are the conditions for lots of data we might be interested in sampling in the ‘real world’.

It’s worth having a couple of questions to clarify what we are interested in. How about: is it possible for the mean to be 1000 times larger than the median; and is it possible for the median to be 1000 times larger than the mean?

The latter is easier to address. If the median is 1000 and the mean is 1, then with probability ½ the random variable X is at least 1000. So these values make a contribution to the mean of at least 500, while the other values make a contribution of at least zero (since we’ve demanded the RV be positive). This is a contradiction.

The former turns out to be possible. The motivation should come from our box-and-whisker diagram! Once we have fixed the middle box, the median and quartiles are fixed, but we are free to fiddle with the outer regions as much as we like, so by making the max larger and larger, we can increase the mean freely without affecting the median. Perhaps it is clearest to view a discrete example: 1, 2, N. The median will always be 2, so we can increase N as much as desired to get a large mean.
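The discrete example is easy to play with using Python's standard statistics module. The wrapper mean_and_median is just a convenience for this sketch.

```python
from statistics import mean, median

def mean_and_median(n):
    """Mean and median of the three data values 1, 2, n."""
    data = [1, 2, n]
    return mean(data), median(data)

for n in (3, 3000, 5997):
    print(n, mean_and_median(n))
```

With N=5997 the mean is 2000, exactly 1000 times the median of 2, which answers the first of the two questions.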

The first answer is in a way more interesting, because it generalises to give a result about the tail of distributions. Viewing the median as the ½-quantile, we are saying that it cannot be too large relative to the mean. Markov’s inequality provides an identical statement about the general quantile. Instead of thinking about the constant a in an a-quantile, we look at values in the support.

Suppose we want a bound on \mathbb{P}(X\ge a) for some positive a. Define the function f by

f(x)=a \textbf{1}_{\{x\ge a\}},

so that f(x)\le x for all non-negative x. Hence the mean of f(X) is at most the mean of X. But the mean of f(X) can be calculated as

a\mathbb{P}(X\ge a),

and so we conclude that

\mathbb{P}(X\ge a)\leq \frac{\mu}{a},

which is Markov’s Inequality.
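Here is a small exact check of the inequality on a made-up non-negative discrete distribution. The pmf and the function names are assumptions for illustration. The tail is taken as X ≥ a, matching the indicator in f, and this is at least as large as the probability that X > a.

```python
from fractions import Fraction

def mean(pmf):
    """Mean of a discrete random variable given as {value: probability}."""
    return sum(k * p for k, p in pmf.items())

def tail(pmf, a):
    """P(X >= a) for a discrete random variable."""
    return sum(p for k, p in pmf.items() if k >= a)

# Hypothetical distribution on {0, 1, 2, 4} with mean 1.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 4: Fraction(1, 8)}

for a in (1, 2, 4):
    print(a, tail(pmf, a), mean(pmf) / a)
```

For a = 1, 2, 4 the tails are 1/2, 1/4, 1/8 against bounds of 1, 1/2, 1/4, so the bound holds each time but is off by a factor of two in this example.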

It is worth remarking that this is trivially true when a\le \mu, since probabilities are always at most 1 anyway. Even beyond this region, it is generally quite weak. Note that it becomes progressively stronger if the contribution to the mean from terms greater than a is mainly driven by the contribution from terms close to a. So the statement is strong if the random variable has a light tail.

This motivates considering deviations from the mean, rather than the random variable itself. And to lighten the tail, we can square, for example, to consider the square distance from the mean. This version is Chebyshev’s Inequality:

\mathbb{P}(|X-\mu|^2>a\sigma^2)\le \frac{1}{a}.
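The same kind of exact check works for Chebyshev. The sketch below uses an invented discrete distribution with mean 1 and variance 7/4, and compares the two sides of the inequality for a few integer values of a. The function names are hypothetical.

```python
from fractions import Fraction

def mean(pmf):
    """Mean of a discrete random variable given as {value: probability}."""
    return sum(k * p for k, p in pmf.items())

def variance(pmf):
    """Variance, computed as the mean squared distance from the mean."""
    mu = mean(pmf)
    return sum((k - mu) ** 2 * p for k, p in pmf.items())

def chebyshev_sides(pmf, a):
    """Return (P(|X - mu|^2 > a * sigma^2), 1 / a) for a positive integer a."""
    mu, var = mean(pmf), variance(pmf)
    lhs = sum(p for k, p in pmf.items() if (k - mu) ** 2 > a * var)
    return lhs, Fraction(1, a)

# Hypothetical distribution on {0, 1, 2, 4}.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 4: Fraction(1, 8)}

for a in (2, 4, 6):
    print(a, chebyshev_sides(pmf, a))
```

Note that this is just Markov applied to the non-negative random variable |X-\mu|^2, whose mean is \sigma^2.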

Applying Markov to an exponential function of a random variable is called a Chernoff Bound, and gives in some sense the best bound on tails of a distribution obtained in this way.
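To get a feel for how much the exponential version gains, here is a sketch for one specific case chosen for illustration: a standard normal random variable. Its moment generating function is exp(t^2/2), so Markov applied to exp(tX) gives a tail bound of exp(t^2/2 - ta), which is smallest at t = a. The function names are made up, and the exact tail comes from the complementary error function in Python's math module.

```python
import math

def normal_tail(a):
    """Exact P(X > a) for a standard normal X."""
    return 0.5 * math.erfc(a / math.sqrt(2))

def chernoff_bound(a):
    """Markov applied to exp(tX), optimised over t (the minimum is at t = a)."""
    return math.exp(-a * a / 2)

def chebyshev_bound(a):
    """Chebyshev bound on P(|X| > a), which also bounds P(X > a)."""
    return 1 / (a * a)

for a in (1, 2, 3, 4):
    print(a, normal_tail(a), chernoff_bound(a), chebyshev_bound(a))
```

For this light-tailed distribution the exponential bound decays far faster than the polynomial one, although it is still some way from the true tail.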