How to take a Passport Photograph – Part 1

As seems to be the case whenever you start somewhere new, I’ve needed an almost infinite supply of passport-sized photographs recently. The university, my college, my department and of course the Chinese immigration authorities all wanted a record of my beautiful features. As a result of all this interest, I was in the do-it-yourself photo booth in WHSmiths in Wandsworth the other day getting some more. The first attempt looked fine, but the machine offered me the possibility of trying again, up to twice if I wanted. This seemed like a win-win situation, so I said yes, not realising that the one I already had would not be kept ‘in the bag’. The second attempt looked somewhat startled, a pose that runs in my family, but not wanting to risk a disastrous third attempt (and the financial penalty of having to repeat the whole operation) I confirmed that I was happy and made do with the result. Naturally, the question that struck me was: what is the optimal strategy for such a situation? (Assuming that, unlike me, you know the rules from the beginning.)

Mathematical model and choices

Let’s formulate this mathematically. Suppose there are n possible trials, corresponding to iid random variables X_1,\ldots,X_n. (Note that this assumes that your ‘performance’ does not improve or otherwise change during the process. Perhaps not a reasonable assumption in some contexts?) After trials X_1,\ldots,X_k have been observed, you have to choose whether to accept the value X_k as your ‘final answer’, or whether to continue.

The first key decision is: what distribution should the X_i’s have? Since in the original problem there isn’t a natural metric for quality, let’s assume that the X_i’s represent some well-defined quantitative utility, distributed as uniform [0,1] random variables. Perhaps a normal random variable would be a more realistic model, but I can solve the problem in this case, so let’s stick with this for now. In addition, to make the eventual answer simpler, let’s say that 0 is the best quality and 1 the worst. That is, we are looking for a strategy that stops the process at a time T so as to minimise \mathbb{E}X_T.
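To make the setup concrete, here is a minimal simulation sketch in Python (standing in for the MATLAB mentioned below; the function run_strategy and the particular cut-offs are hypothetical choices of mine, not part of the original problem):

```python
import random

def run_strategy(thresholds):
    """Simulate one round of the process: accept X_k as the final
    answer as soon as it falls below the current cut-off; the last
    trial must always be accepted. Returns X_T (0 best, 1 worst)."""
    n = len(thresholds)
    for k, cutoff in enumerate(thresholds):
        x = random.random()            # X_{k+1} ~ Uniform[0,1], iid
        if x < cutoff or k == n - 1:   # accept, or no trials remain
            return x

# Example: n = 3 trials with arbitrary (not yet optimal) cut-offs.
samples = [run_strategy([0.4, 0.5, 1.0]) for _ in range(10**5)]
print(sum(samples) / len(samples))     # Monte Carlo estimate of E[X_T]
```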

Finding an optimal strategy

The key observation is the following. In words: if we reject X_1, we can forget its value, since it is independent of \{X_2,\ldots,X_n\}, which is now all we have left to base future judgements on. We return to the original problem with one fewer trial. In mathematical notation: conditional on \{T\neq 1\}, T is independent of X_1.

The following argument assumes that an optimal strategy exists, which is not ideal, but this can easily be justified. For now, though, we proceed relatively informally by induction on n.

Let S be the stopping time for the optimal strategy on X_2,\ldots,X_n, which we assume exists by induction. It is ‘obvious’ that the optimal strategy T for X_1,\ldots,X_n should be the following:

  • T=1 iff X_1<a(n), where a(n) is a deterministic quantity depending only on n.
  • Conditional on T\neq 1, take T=S.

From this alone, we can calculate \mathbb{E}X_T.

\mathbb{E}X_T=\mathbb{E}X_S1_{T=S}+\mathbb{E}X_1 1_{T=1}

= (1-\mathbb{P}(T=1))\mathbb{E}X_S+\mathbb{E}X_1 1_{X_1<a(n)}

= (1-a(n))\mathbb{E}X_S +\frac12 a(n)^2.

Here the first line uses that on \{T\neq S\} we have T=1; the second that X_S is independent of the event \{T=1\}=\{X_1<a(n)\}; and the third that \mathbb{P}(X_1<a(n))=a(n) and \mathbb{E}X_1 1_{X_1<a(n)}=\int_0^{a(n)}x\,dx=\frac12 a(n)^2.

This is minimised precisely when a(n)=\mathbb{E}X_S: differentiating (1-a)\mathbb{E}X_S+\frac12 a^2 with respect to a gives a-\mathbb{E}X_S, which vanishes when a=\mathbb{E}X_S. We conclude that the optimal strategy, as of course we might well expect, is to take T=1 precisely if X_1 is less than the expected result of applying the optimal strategy to the remaining random variables.

By extension, we have a(n+1)=\mathbb{E}X_T, and so substituting a(n)=\mathbb{E}X_S into the expression above gives

a(n+1)=a(n)\left[1-\frac{a(n)}{2}\right]. (*)

The first few values are:

a(1)=1,\quad a(2)=\frac12,\quad a(3)=\frac38,\quad a(4)=\frac{39}{128},\quad\ldots
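These values are easily reproduced exactly; a quick sketch using Python’s fractions module (the function name a_exact is my own):

```python
from fractions import Fraction

def a_exact(n):
    """a(n) computed exactly via the recursion a(n+1) = a(n)(1 - a(n)/2)."""
    val = Fraction(1)          # a(1) = 1
    for _ in range(n - 1):
        val *= 1 - val / 2
    return val

print([a_exact(n) for n in range(1, 5)])
# [Fraction(1, 1), Fraction(1, 2), Fraction(3, 8), Fraction(39, 128)]
```

Feeding the cut-offs a(n),a(n-1),\ldots,a(1) into the run_strategy sketch above should then give a Monte Carlo estimate of \mathbb{E}X_T close to a(n+1).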

Behaviour of a(n)

The first question is: as n grows large, does a(n)\rightarrow 0? Well, this isn’t too hard: the recursive definition (*) shows that the sequence a(1),a(2),\ldots is (strictly) decreasing and bounded below by 0, and so has a limit, which must be a fixed point of (*). The only fixed point of a=a\left[1-\frac{a}{2}\right] is a=0.

The second question is: what is the asymptotic behaviour of a(n) for large n? A quick run in MATLAB, or examination of the equation (*), suggests that

a(n)\approx \frac{2}{n}

should describe the behaviour well for large n. My basic attempts to verify this were initially unsuccessful, but I felt fairly sure that it should be true in some suitable asymptotic sense because of the following highly non-rigorous but nonetheless convincing idea.

Claim: a(n)=\frac{2}{n}+O(n^{-3}) satisfies (*).

Why? Well, the difference between \frac{2}{n+1} and the right-hand side of (*) is then:

\frac{2}{n+1}-\left[(\frac{2}{n}+O(n^{-3}))-\frac12(\frac{2}{n}+O(n^{-3}))^2\right]

=\frac{2}{n+1}-\frac{2}{n}+\frac{2}{n^2}+O(n^{-3})

=\frac{2}{n^2(n+1)}+O(n^{-3})=O(n^{-3}).

This proves the claim, but none of the = signs is especially meaningful here. Perhaps there is a really slick way to tie this up that I’ve missed? In any case, I will save my own slightly involved method for a new post.
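For what it’s worth, here is the numerical evidence (Python again standing in for MATLAB) that n\,a(n) does indeed approach 2, albeit rather slowly:

```python
def a_float(n):
    """a(n) via the recursion, in floating point."""
    val = 1.0                  # a(1) = 1
    for _ in range(n - 1):
        val -= val * val / 2   # a(n+1) = a(n)(1 - a(n)/2)
    return val

for n in [10, 100, 10**3, 10**4, 10**5, 10**6]:
    print(n, n * a_float(n))   # n * a(n) creeps up towards 2
```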
