Estimating Wrong and Estimating the Wrong Metric

In business, whether at a large enterprise or a startup, we make many estimates in our roles: market size, addressable market, penetration, market share growth, the effect of a marketing campaign, and so on.

In the absence of clear data we make assumptions, look at past performance, compare to similar entities, and try to estimate the metric we are interested in. In this process we make process errors – errors in spreadsheets, coding, data entry – errors that are avoidable with a stricter framework or can be weeded out with a review process.

What is more dangerous are the bias-driven errors that cannot be caught by any review process, because we all share the same views and biases.

Here are the estimation errors we are oblivious to. For a vivid illustration of these errors, see this story on museum visits:

Museums take pains to make the past come alive, but many are out of their depth when looking into the future.

When selling plans for a new building or a blockbuster exhibit, civic leaders and museum officials typically cite projected visitor counts. These numbers can be effective in securing funding and public support. As predictions, however, they often are less successful.

  1. Estimating Exact Numbers – Estimating a single exact number with certainty, because we look down on those who hedge as diffident.
  2. Comparison Error – Basing estimates on other, similar projects but erring by choosing the most successful ones (not to mention whatever is available and recent).
  3. Atypical Scenarios – Basing estimates not on what is most common and most likely to happen but on atypical scenarios.
  4. Context Error – Ignoring the state of your business (stage in the growth cycle, etc.), market dynamics, customer segment, and so on.
  5. Attribution Error – Attributing someone’s success to the wrong traits and basing our own estimates on those shared traits.
  6. Using Lore – Using a very large number from the past because it is part of the organization’s shared lore, never stopping to ask, “Where did we get that number?”
  7. Ignoring Externalities – Making estimates without regard to economic factors, disruptions, or what competitors would do.
  8. Underestimation – Deliberately estimating low in order to exceed expectations.

 

Finally, to top all these errors, the biggest of them all is to estimate a completely irrelevant, wrong metric because it is easy: page views, number of retweets, user base, etc. So even if you fix all the above errors, you end up with a perfectly estimated wrong metric. I have written about this in detail here.

Do you know your estimation errors?

Big Data predicts people who are promoted often quit anyway – But …

I saw this study from HP that used analytics (okay Big Data analytics, whatever that means here) to predict employee attrition.

HP data scientists believed a companywide implementation of the system could deliver $300 million in potential savings “related to attrition replacement and productivity.”

I must say that, unlike most data dredging that goes on with selective reporting, these data scientists started with a clear goal in mind and a decision to change before diving into data analysis. It is not the usual:

“Storage and compute are cheap. Why throw away any data? Why make a decision of what is important and what is not? Why settle for sampling when you can analyze them all? Let us throw in Hadoop and we will find something interesting”

Their work found:

Those employees who had been promoted more times were more likely to quit, unless a more significant pay hike had gone along with the promotion

The problem? This is not a hypothesis they developed independent of the data and then collected data to test; that is the prescribed approach to hypothesis-driven data analysis. Even with that method, one cannot stop when the data fits the hypothesis, because the data can fit any number of plausible hypotheses.

The problem is magnified with Big Data, where even tiny correlations get reported because of the sheer volume of data.
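To see how easily this happens, here is a minimal sketch in Python (using made-up data, not the HP data set): generate a couple of hundred employee metrics that are pure noise along with a random quit flag, and some of them will still clear a conventional significance threshold by chance alone.

```python
# Minimal sketch: with enough unrelated metrics, some will correlate with
# attrition purely by chance. All data here is random noise (hypothetical).
import numpy as np

rng = np.random.default_rng(42)
n_employees, n_metrics = 1000, 200

metrics = rng.normal(size=(n_employees, n_metrics))   # random "metrics"
quit_flag = rng.integers(0, 2, size=n_employees)      # random attrition flag

# Correlation of each metric with attrition.
corrs = np.array([np.corrcoef(metrics[:, j], quit_flag)[0, 1]
                  for j in range(n_metrics)])

# Roughly the 1.96/sqrt(n) cutoff for a nominally "significant" correlation.
threshold = 1.96 / np.sqrt(n_employees)
n_spurious = int((np.abs(corrs) > threshold).sum())
print(f"{n_spurious} of {n_metrics} pure-noise metrics look 'significant'")
```

Run it and roughly 5% of the noise metrics will look “significant”; with millions of rows and thousands of attributes, the absolute number of spurious findings only grows.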

What does it mean that people who are promoted often quit?

Is frequent promotion the culprit? Isn’t it likely that those who are driven and add high value are more likely to get promoted often, more likely to want to take on new challenges, and also more attractive to other companies?

The study adds, “unless associated with a more significant pay hike”.

Isn’t it more likely that the company is either simply using titles to keep disgruntled employees happy or just making up titles to hold on to high-performing employees without really paying them for the value they add? In either case, aren’t the employees more likely to leave after a few namesake promotions that don’t really mean anything?

Let us look at the flip side. Why do people who are not promoted frequently end up staying? Why do companies give big raises to keep those who were promoted?

Will stopping frequent promotion stop the attrition? Or will frequent promotion with big pay raises stop it? Neither will have an effect.

The study and the analysis fail to ask:

Is the business better off paying big raises to keep those who are frequently promoted rather than letting them leave?

Is the business better off if those who are not promoted often choose to stay?

That is the problem with this study and with Big Data analytics that do not start with a hypothesis developed outside of the very data used to prove it. The analysis finds the first interesting correlation, “frequent promotions are associated with attrition,” and declares predictability without getting into alternative explanations or root causes.

Big Data does not mean suspension of mind and eradication of theory. The right flow remains:

  1. What is the decision you are trying to change?
  2. What are the hypotheses about drivers, developed by application of mind and prior knowledge?
  3. What is the right data to test?
  4. When data fits hypothesis could there be alternate hypotheses that could explain the data?
  5. How do the hypotheses change when new data comes in?

How do you put Big Data to work?

Pig or a Dog – Which is Smarter?: Metric, Data, Analytics and Errors

How do you determine which interview candidate to hire? How do you evaluate the candidate you decided you want to hire? (or decided you want to flush?)

How do you make a call on which group is performing better? How do you hold one group accountable for (or explain away) bad performance in a quarter vs. the other?

How do you determine future revenue potential of a target company you decided you want to acquire? (or decided you don’t want to acquire)?

What metrics do you use? What data do you collect? And how do you analyze that to make a call?

Here is a summary of an episode of Fetch! with Ruff Ruffman, the PBS Kids TV show:

Ruff’s peer, Blossom the cat, informs him that pigs are smarter than dogs. Not believing her and determined to prove her wrong, Ruff sends two very smart kids to run a test. The two kids go to a farm with a dog and a pig. They decide that the time taken to traverse a maze is the metric they will use to determine which animal is smarter. They design three different mazes:

  1. A simple straight line (a very good choice, as this will serve as the baseline)
  2. A maze with a turn but no dead-ends (increasing complexity)
  3. A maze with two dead-ends

Then they run three experiments, letting the animals traverse the mazes one at a time and measuring the time for each run. The dog comes out ahead, taking less than ten seconds in each case, while the pig consistently takes more than a minute.

Let me interrupt here to say that the kids did not really want Ruff to win the argument. But the data seemed to show otherwise. So one of the kids changes the definition on the fly:

“Maybe we should re-run the third maze experiment. If the pig remembers the dead-ends and avoids them, it will show the pig is smarter, because the pig is learning.”

And they do. The dog takes ~7 seconds, compared to the 5.6 seconds it took in the first run. The pig does it in half the time of its previous run, 35 seconds.

They write up their results. The dog’s performance worsened while the pig’s improved. So the pig clearly showed learning and the dog didn’t. The pig indeed was smarter.

We are not here to critique the kids. This is not about them. This is about us, leaders, managers and marketers who have to make such calls in our jobs. The errors we make are not that different from the ones we see in the Pigs vs. Dogs study.

Are we even aware we are making such errors? Here are five errors to watch out for in our decision making:

  1. Preconceived notion: There is a difference between a hypothesis you want to test and proving a preconceived notion.

    A hypothesis is, “Dogs are smarter than pigs”. So is, “The social media campaign helped increase sales”.

    A preconceived notion is, “Let us prove dogs are smarter than pigs”. So is, “Let us prove that the viral video of the man on the horse helped increase sales”.

  2. Using the right metric: What defines success and what “better” means must be defined in advance and should be relevant to the hypothesis you are testing.
    Time to traverse the maze is a fine metric, but is it the right one to determine which animal is smart? Smart or not, dogs have an advantage over pigs: they respond to the trainer’s call and move in that direction. Pigs respond only to the presence of food. That seems unfair already.
    Measuring the presence of a candidate may be fine, but is that the right metric for the position you are hiring for? Measuring the number of views on your viral video is fine, but is that relevant to performance?
    It is usually a bad choice to pick a single metric. You need a basket of metrics that, taken together, point to which option is better.
  3. Data collection: Are you collecting all the relevant data, or collecting what is convenient and available? If you want to prove Madagascar is San Diego, you will look only for white sandy beaches. If you stop after finding a single data point that fits your preconceived notion, you will end up taking a $9B write-down on that acquisition.
    Was it enough to test one dog and one pig to make a general claim about dogs and pigs?
    Was one run of each experiment enough to provide relevant data?
  4. Changing definitions midstream: Once you decide on the hypothesis to test, the metrics, and the experimental procedure, you should stick to them for the scope of the study and not change them when it appears the results won’t go your way.
    There is nothing wrong with changing the definition, but you then have to start over and be consistent.
  5. Analytics errors: Can you make sweeping conclusions about performance without regard to variation?
    Did the dog really worsen, did the pig really improve, or was it simply regression to the mean?
    Does the 49ers’ backup quarterback really have a hot hand that justifies benching Alex Smith? What you see as a sales jump from your social media campaign could easily be due to the usual variation in sales performance. Did you measure whether the uplift exceeds the usual variation against a comparable baseline (see the sketch after this list)?
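Here is a minimal sketch of that last check, with made-up sales numbers: compare the post-campaign figure against the mean and standard deviation of baseline periods before crediting the campaign.

```python
# Minimal sketch: is the post-campaign week an outlier relative to the usual
# week-to-week variation? All numbers below are hypothetical.
import statistics

baseline_weekly_sales = [102, 96, 110, 99, 105, 93, 108, 101]  # hypothetical
campaign_week_sales = 115                                      # hypothetical

mean = statistics.mean(baseline_weekly_sales)
stdev = statistics.stdev(baseline_weekly_sales)
z = (campaign_week_sales - mean) / stdev

print(f"baseline mean {mean:.1f}, stdev {stdev:.1f}, z-score {z:.2f}")
if z < 2:
    print("Within ~2 standard deviations of normal variation; no uplift yet.")
else:
    print("Larger than the usual variation; worth a closer look.")
```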

How do you make decisions? How do you define your metrics, collect data and do your analysis?

Note: It appears from a different controlled experiment that pigs are indeed smarter. But if they are so smart, how did they end up as lunch?

Apple Playing High Risk Game with iPad mini – Monte Carlo Analysis

It appears the iPad mini (or whatever branding Apple comes up with for its 7″ tablet) is real. Apple did announce an event for October 23rd, which is very likely the iPad mini event.

I have written about the profit impact of the iPad mini, and so did many others. (See my longer piece at GigaOm.)

Many take the approach:

  1. Apple will sell tens of millions of iPad minis before the holidays
  2. The iPad mini is a market share game
  3. There will be cannibalization, but it is better to self-cannibalize
  4. There will be so much new volume from the lower price point of the iPad mini that Apple will capture market share
  5. The iPod Touch is a different product category and will not be impacted by the iPad mini

My question has centered on whether or not the new device will deliver incremental (net new) profit. No one has done any real analysis to show what the impact is. Even my article stopped short of exact numbers. Articles by others are (of course) even worse; they expect us to believe on faith that Apple will do well with the iPad mini.

Now there is some real answer, based on more rigorous analysis than just claims that self-cannibalization is better.

My analysis, using statistical modeling, shows Apple may end up selling 22-52 million iPad minis but is placing a high risk bet when it comes to profit. Let us start from the beginning.

As I did before for the Pinterest revenue model, I chose to do a Monte Carlo analysis to find the impact of the iPad mini on Apple’s profits. This is a reliable tool when there are many variables and uncertainty in the result. It also helps to state the result as a probability distribution instead of the absolute statements we see from some analysts.

The model starts with listing the different variables that feed into the final result and their 90% confidence interval values. That is, for each variable we state the low and high values we are 90% confident about (we are 90% confident the real value is between the low and the high, with only a 10% chance it falls outside this range).

I am going to assume the contribution margin from the iPad is $225 (given the 40%-50% margin numbers stated by iSuppli and others). All volume numbers are for the full year. The trade-down numbers and the “steal” numbers come from recent market research on iPad mini preference. Steal here means how many current Nook/Kindle/Nexus customers will switch to the iPad mini. New sales is the number of new customers entering the market because of the iPad mini. Current iPad volume numbers are based on Apple’s past four earnings reports.

It is easy to see that:

Total iPad mini sales = Trade-down volume + Steal + New sales

Profit from iPad mini = iPad mini margin × Total iPad mini sales

Lost profit from trade-down = iPad margin × Trade-down volume

Net new profit = Profit from iPad mini − Lost profit from trade-down

Note that I ignored the effect on other products, both the iPod Touch and the iPhone.
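For readers who want to try this themselves, here is a minimal sketch of this kind of model in Python. The formulas follow the ones above; the 90% ranges for each variable are placeholders I made up for illustration, since the actual inputs came from the market research and earnings data mentioned earlier.

```python
# Minimal Monte Carlo sketch of the iPad mini profit model. The 90% ranges
# below are illustrative placeholders, not the article's actual inputs.
import numpy as np

rng = np.random.default_rng(0)
N = 1600  # iterations, as in the article

def sample_90ci(low, high, size):
    """Sample a normal whose 5th/95th percentiles match the stated 90% range."""
    mean = (low + high) / 2
    sd = (high - low) / 3.29  # 90% of a normal lies within +/-1.645 sd
    return rng.normal(mean, sd, size)

# Hypothetical 90% confidence intervals (volumes in millions of units).
trade_down = sample_90ci(8, 20, N)     # current iPad buyers trading down
steal      = sample_90ci(5, 15, N)     # switchers from Nook/Kindle/Nexus
new_sales  = sample_90ci(5, 20, N)     # net-new customers
mini_margin = sample_90ci(60, 120, N)  # iPad mini contribution margin ($/unit)
ipad_margin = 225                      # iPad contribution margin ($/unit)

total_mini = trade_down + steal + new_sales
net_new_profit = mini_margin * total_mini - ipad_margin * trade_down  # $ millions

print(f"Expected iPad mini volume: {total_mini.mean():.0f} million units")
print(f"Chance net new profit is negative: {(net_new_profit < 0).mean():.0%}")
print(f"Expected net new profit: ${net_new_profit.mean():.0f} million")
```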

Running the model for 1600 iterations yields some stunning results.

First, the total iPad mini volume numbers. These are huge. It is almost certain that Apple will sell at least 14 million units per year. There is a 95% probability that they will sell somewhere between 22 million and 52 million iPad minis. Considering all possible scenarios, the expected volume is 35 million units. These kinds of numbers blow past the ramp-up curves we have seen with any electronics product.

Such numbers will bring a smile to those who chase market share and will delight analysts who recommend chasing it. But what does that do to Apple’s profit?

Here is the big surprise. Despite huge volumes, profit estimates show Apple is playing a high risk game with iPad mini.

First there is a 47% chance Apple will lose money (not including fixed costs, just the marginal costs, so the real impact can be worse).

At its worst, there is a 1% chance that Apple could see a $2.2B drop in its gross profit. It does not get much better: there is a 15% chance Apple could see a $1B drop in its profit.

At the other end, there is only a 1% chance they could make $2.3B in additional profit and only a 13% chance they could see $1B in additional profit.

Considering all possible scenarios, the expected net new profit from the iPad mini is just $97 million a year.

That is not big enough considering other R&D and marketing expenses (fixed costs).

There you have it. Apple will likely sell 34 million units in the first year but runs the risk of seeing no impact, or worse, a significant negative impact on its profit.

Analysts betting on Apple stock, thinking the iPad mini will add a few dollars to its EPS, take note: the iPad mini is a high-risk game for Apple despite assured high volumes.

Looking for falsifying evidence

Here is a puzzle I saw in Gruber’s flash cards for elementary school children.

More people who smoke will develop lung cancer than those who do not smoke.

What research will show smoking does not cause lung cancer?

This is not an argument about smoking, Big Tobacco, or morals. I like this question because it is simple, popular, and familiar to most of us. The first statement makes us draw the most obvious conclusion – smoking causes lung cancer. Likely we won’t look past this statement. And that is what makes the question very interesting.

The questions are: given all our knowledge and preconceived notions (so to speak), if you were asked to falsify the causation claim,

  1. What research will you do?
  2. What data will you seek?

This twist makes us stop, ignore our System 1 (Kahneman) and think. Finding one more example to support the claim is not difficult. Finding falsifying evidence is not only difficult but requires a different thought process.

You see numerous such causation claims in pulp non-fiction business books (7 Habits, In Search of Excellence, Good to Great, Linchpin, Purple Cow) and blogs. Mom-and-apple-pie advice about startups, running a business, marketing, etc. bombards us every day on Twitter. Our System 1 wants us to accept these. After all, they are said by someone popular and/or in power, and the advice is so appealing and familiar.

Penn Jillette of Penn and Teller wrote,

“Magic is unwilling suspension of disbelief”

For example, the audience cannot ask to walk up on stage to look inside the boxes. They have to accept the magician’s word for it. That is unwilling suspension of disbelief. When it comes to gross generalizations and theories supported only by the very data used to form them (e.g., “What can we learn from xyz”), we don’t have to suspend disbelief. We have the will to seek the evidence that will falsify the claim.

Do we stop and look for falsifying evidence or find solace in the comfort of such clichéd advice?

By the way, the answer to the Gruber puzzle lies in looking for a lurking variable. And there is none.

What Percentage of US Teachers are Millionaires?

I saw a statistic that said 14% of the millionaires in the US are teachers, second only to managers and ahead of lawyers and doctors.

Given this and what you know about teachers, what percentage of teachers do you think are millionaires?

About same? Higher? Lower? Write down the first thought that comes to your mind.

My preconceived notion, based on what I can selectively recall, is that there are far fewer millionaires among teachers compared to the rest of the population. In other words, while 14% of millionaires are teachers, a far smaller proportion of teachers are millionaires.

Until I did the math.

P(Teacher | Millionaire) = P(T | M) = 0.14

What we want is: given someone is a teacher, how likely is that teacher to be a millionaire? That is, what is P(M | T)?

P(M | T) = P(T | M) × P(M) / P(T)

To compute this we need

  1. What proportion of the US population are teachers, P(T)
  2. What proportion of the US population are millionaires, P(M)

Data from the US Census says there are 7.2 million teachers, and 8.6 million people are millionaires.

That means P(M | T) is 16.7%, or one in six teachers is a millionaire.
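The arithmetic is easy to check; note that the US population size cancels out of the P(M)/P(T) ratio, so the raw counts are enough.

```python
# Bayes' rule with the counts above (in millions); the population size cancels.
us_teachers = 7.2
us_millionaires = 8.6
p_teacher_given_millionaire = 0.14

# P(M | T) = P(T | M) * P(M) / P(T) = P(T | M) * (millionaires / teachers)
p_millionaire_given_teacher = (p_teacher_given_millionaire
                               * us_millionaires / us_teachers)
print(f"P(Millionaire | Teacher) = {p_millionaire_given_teacher:.1%}")  # ~16.7%
```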

That does correct the stereotype I had about teachers.


Does this mean that if you become a teacher (and you are not already a millionaire), your chances of becoming a millionaire are 16.7%?

Unfortunately not.

46% of the millionaire teachers reported they got there through inheritance. A good portion reported having a second income through a working spouse. A few more became teachers after becoming millionaires, because they had the financial flexibility to do whatever they liked. Taking these numbers out, only 4 to 8% of teachers became millionaires because of the profession.
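As a rough back-of-the-envelope check (the only fraction given above is the 46% inheritance figure; the other fraction below is a placeholder assumption for the spouse-income and already-millionaire cases), the adjusted number does land near the stated 4 to 8% range.

```python
# Illustrative adjustment only; 0.30 is a made-up placeholder fraction.
p_millionaire_given_teacher = 0.167

inheritance = 0.46       # reported inheritance share, per the article
other_sources = 0.30     # hypothetical: spouse income, already millionaires

attributable = p_millionaire_given_teacher * (1 - inheritance - other_sources)
print(f"Millionaires attributable to the profession: ~{attributable:.0%}")  # ~4%
```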

That is still a high number and includes a broad class of teachers including college professors.

So there you have it. One in six teachers in the US is a millionaire, but if you want to become a millionaire by choosing the teaching profession, your chances are less than one in twenty.