This isn’t original; I read it somewhere. Nevertheless, it serves to explain estimates, confidence intervals and precision.

Suppose I asked you to estimate how much cash Brad Pitt carries in his wallet. What would be your guess?

It is hard to guess right. It is hard because I asked you for a single number, and given that there are many possibilities (even with just whole numbers), your answer is likely to be wrong. With a single-value guess you also cannot tell how confident you are in the estimate. That is the problem with single-value estimates, whether you are estimating cash in a wallet or the expected revenue impact of a marketing campaign. Don’t give a single number, and don’t trust anyone who gives you one.

What if we asked 1000 random people on the street, collected all their answers and averaged them? Would that give the right answer? Isn’t that the wisdom of the crowd? Well, it won’t be the right answer. But if you plot the answers against the number of people who gave each value (a histogram), you will likely see a Normal curve. The distribution will tell us the low and high values for the cash in Brad Pitt’s wallet and also the chance that the amount falls outside this range.

Suppose 95% of the responses fall between $10 and $978. Then we could say, “We are 95% confident Brad has $10 to $978.” We could still be wrong in saying “we are 95% confident” if we received homogeneous answers and hence got too narrow a range.
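To make the "95% of the responses" idea concrete, here is a minimal sketch in Python. The 1000 guesses are simulated with made-up parameters (a Normal curve centred on $500), not real survey data, and `range_95` is just an illustrative helper name. It reads off the 2.5th and 97.5th percentiles, so that 95% of the answers sit between them.

```python
import random
import statistics

def range_95(responses):
    """Return the (low, high) values that bracket the central 95% of responses."""
    # n=40 asks for cut points every 2.5%, so the first cut is the
    # 2.5th percentile and the last cut is the 97.5th percentile.
    cuts = statistics.quantiles(responses, n=40)
    return cuts[0], cuts[-1]

# Hypothetical crowd: 1000 simulated guesses in dollars (assumed mean and spread).
rng = random.Random(7)
responses = [max(0.0, rng.gauss(500, 150)) for _ in range(1000)]

low, high = range_95(responses)
print(f"95% of the guesses fall between ${low:,.0f} and ${high:,.0f}")
```

Note that this only describes the spread of the guesses. If everyone in the crowd guesses alike, the range comes out narrow regardless of what is actually in the wallet.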

Instead of asking 1000 people, what if I asked you not for a single value but for your 95% confidence interval for the amount of cash in his wallet? What would be your answer?

It is roughly the equivalent of asking 1000 different people. And since I asked for 95% confidence, you should give a range such that there is only a 5% chance the real answer lies outside it. I am not asking for precision (so don’t try to give a narrow range) but for a high confidence level (so you should go wide).

Since you do not know anything about the cash-carrying habits of Hollywood stars, you should trade off precision for confidence. You could answer $0 to $100,000. That is acceptable, but too wide a range to be of real use in cases other than estimating Brad Pitt’s wallet. However, you can apply what you know about how many bills fit in a wallet and give a better range, like $10 to $2000.

That is what you would do when measuring outcomes of events with many unknowns. You break down the BIG unknown into a set of component unknowns, and for each smaller unknown you make an estimate at a given confidence level. Stating a range with a confidence level (a confidence interval) based on prior knowledge is far better and more usable than a single number that we are asked to trust.
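One way to combine the component estimates is a simple simulation. The sketch below is only an illustration under stated assumptions: the three components (reach, conversion rate, order value) and their 90% ranges are invented for the example, each range is treated as the central 90% of a Normal curve, and the components are assumed to be independent. It draws many possible outcomes and reports the range that holds 90% of them.

```python
import random
import statistics

Z_90 = 1.645  # about 90% of a Normal curve lies within 1.645 standard deviations of the mean

def sample_from_interval(low, high, rng):
    """Draw one value from a Normal curve whose central 90% spans low..high."""
    mean = (low + high) / 2
    sd = (high - low) / (2 * Z_90)
    return rng.gauss(mean, sd)

def simulate_revenue(n_trials=10_000, seed=42):
    """Simulate campaign revenue as reach x conversion rate x order value."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_trials):
        # Hypothetical 90% ranges for each component unknown.
        reach = sample_from_interval(20_000, 60_000, rng)   # people reached
        conversion = sample_from_interval(0.01, 0.03, rng)  # share who buy
        order_value = sample_from_interval(30, 50, rng)     # dollars per order
        outcomes.append(reach * conversion * order_value)
    return outcomes

outcomes = simulate_revenue()
cuts = statistics.quantiles(outcomes, n=20)  # cut points every 5%
low_90, high_90 = cuts[0], cuts[-1]          # 5th and 95th percentiles
print(f"90% of simulated outcomes fall between ${low_90:,.0f} and ${high_90:,.0f}")
```

Multiplying the midpoints alone would give a single number ($32,000 here) with no indication of how far off it could be. The simulation keeps the uncertainty of each component visible in the final range.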

“The marketing campaign will increase sales by 45%”

“We are 90% confident the marketing campaign will result in a sales increase in the range of 20% to 43%”

Which one of these two is more trustworthy?