Jack Reacher the Bayesian

A frequentist thinks of probability as countable – that is, you count all the possible outcomes and express the probability of a specific event as a fraction of those possibilities, like 50% for heads in a coin flip. Speaking of coin flips, I came across this passage in a Jack Reacher novel by Lee Child:

      First time: heads or tails? Fifty-fifty, obviously.
      So, second time: heads or tails? Exactly fifty-fifty again. And the third time, and the fourth time. Each flip was a separate event all its own, with identical odds, statistically independent of anything that came before. Always fifty-fifty, every single time.

Then the hero, Jack Reacher, computes the probability of four heads in a row, a simple multiplication of 1/2 × 1/2 × 1/2 × 1/2 = 1/16, since he correctly states that each coin flip is an independent event. Our hero then goes on to calculate the probabilities of events that matter to him:

    And Reacher needed four heads in a row. As in: Would Susan Turner get a new lawyer that afternoon? Answer: either yes or no. Fifty-fifty. Like heads or tails, like flipping a coin. Then: Would that new lawyer be a white male? Answer: either yes or no. Fifty-fifty. And then: Would first Major Sullivan or subsequently Captain Edmonds be in the building at the same time as Susan Turner’s new lawyer? Assuming she got one? Answer: either yes or no. Fifty-fifty. And finally: Would all three lawyers have come in through the same gate as each other? Answer: either yes or no. Fifty-fifty.
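Reacher's starting arithmetic can be sketched in a few lines of Python. This is only my illustration of the multiplication rule for independent events; the function name `all_events_probability` is made up for this post.

```python
from fractions import Fraction

def all_events_probability(probabilities):
    """Multiply per-event probabilities, assuming the events are independent."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result

# Reacher's starting point: four yes/no questions, each treated like a fair coin.
naive = all_events_probability([Fraction(1, 2)] * 4)
print(naive, float(naive))  # 1/16 0.0625
```

The multiplication is only valid if the four events really are independent, and the answer is only as good as the four inputs.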

Now we have a few problems. Can you spot them?

First, the fact that there are only two outcomes should not be confused with their chances of occurring. For example, if you bought a Powerball ticket for tonight’s drawing, tomorrow you will either wake up a millionaire or not. So is your chance of winning 50-50? The same reasoning applies to the four outcomes Reacher is worried about – they are not 50-50 just because each has a yes-no answer.
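To put a number on the lottery example, here is a rough sketch. It assumes a game format of 5 white balls drawn from 69 plus 1 red ball drawn from 26; that format is my assumption for illustration, and the rules have changed over the years.

```python
from math import comb

# Assumed game format (not stated in this post): 5 of 69 white balls plus 1 of 26 red balls.
jackpot_combinations = comb(69, 5) * comb(26, 1)

# One ticket covers exactly one of those equally likely combinations.
p_jackpot = 1 / jackpot_combinations

print(jackpot_combinations)  # 292201338
print(f"{p_jackpot:.2e}")
```

Two possible outcomes, win or lose, yet the chance of winning is nowhere near one in two.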

The second problem is that not all four of the events Reacher is concerned about can be modeled as countable events. For the first, third and fourth events described above, probability is better treated as a measure of uncertainty, because there is no set of equally likely possibilities to count.

The second outcome, “Would that new lawyer be a white male?”, can indeed be modeled as countable rather than as a measure of uncertainty, but it is still not 50-50.

But Lee Child, the author, is likely deliberately misleading readers with this. Starting with 50-50 in the absence of any prior information, and given only two outcomes, is actually not a bad start, provided you understand that this is an uninformed estimate and make an effort to refine it with new information.

That is, you are moving from being a frequentist to being a Bayesian. And Jack Reacher does seem to be a Bayesian:

    Statistics were cold and indifferent. Which the real world wasn’t, necessarily. The army was an imperfect institution. Even in noncombatant roles like the JAG Corps, it wasn’t perfectly gender-neutral, for instance. Senior ranks favoured men. And a senior rank would be seen as necessary, for the defence of an MP major on a corruption charge. Therefore the gender of Susan Turner’s new lawyer wasn’t exactly a fifty-fifty proposition. Probably closer to seventy-thirty, in the desired direction. Moorcroft had been male, after all. And white. Black people were well represented in the military, but in no greater proportion than the population as a whole, which was about one in eight. About eighty-seven to thirteen, right there.

And so on. Reacher goes on to refine the probabilities of the four outcomes with new information, like a Bayesian would.
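Here is a small sketch of what that refinement does to the numbers, using only the two figures quoted in the passage above. Treating gender and race as independent is my simplifying assumption, and I leave the other three questions at the uninformed 50-50 since the quoted passage does not revise them.

```python
from math import prod

# Figures quoted in the passage: about 70-30 for a male lawyer, about 87-13 for a white one.
p_male = 0.70
p_white = 0.87

# Assumption (mine, not the novel's): gender and race are independent,
# so the two estimates can simply be multiplied.
p_white_male = p_male * p_white

# The other three questions stay at the uninformed 50-50 starting point here.
naive = [0.5, 0.5, 0.5, 0.5]
refined = [0.5, p_white_male, 0.5, 0.5]

print(round(p_white_male, 3))   # 0.609
print(round(prod(naive), 4))    # 0.0625
print(round(prod(refined), 4))  # 0.0761
```

This is not a formal application of Bayes' theorem, just the same habit of mind: start from an uninformed estimate and revise it as information comes in.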

I am impressed.

Do you know your p(A), p(B) and p(C)s?