## Results from the Quiz on Probability of Retweet

Remember the question I posed some time back on finding the probability of a tweet with a link being retweeted? The quiz was a fun way to let readers see for themselves the futility of the tips they come across on improving retweets. 390 people took the quiz (40 because I asked, 350 because Avinash Kaushik asked).

Here are the results. Big thanks to SurveyGizmo for its amazing survey platform. The reports pretty much write themselves. You should not even try anything else for running your surveys. For all the percentages you see, the base is 390 responses.

For the first question I presented the only data that I saw in the report I quote. The answer distribution looks like this

One could say that close to two thirds are likely to believe and accept whatever is implied by the commentary attached to a partial finding. While 36% asked for more data, only a third of them asked for the right data, the percentage of tweets retweeted, which would help them answer.

After they answered the first question I provided the additional data, the percentage of RTs. As options I provided the correct answer, the two wrong answers from the previous question and two bogus answers. The answer distribution looks like this

About one in five found the answer (the answer is 16%). It is likely that, even in the presence of additional data, four out of five people can be convinced to accept a different answer, for example when a spurious conclusion is presented in the form of a fancy infographic or by someone popular. When you see that 5000 people tweeted an infographic that talks about scientific ways to improve retweets, it is hard to stop and do the math.

The takeaway is that it is hard for folks to stop and take a critical look at reported social media findings. It is even harder to seek the right additional data and do the math. So most yield to mental shortcuts and answer the easy question.

Note that this 16% number is calculated only to show you the average. But an average hides segments, and likely there are multiple segments here. For some, link or not, every one of their tweets may be retweeted. The only takeaway is that this is a probability calculation and not a recipe, and as we collect more data the probability will change.

## Challenging the Certainty in the Claim I Often Make

I recently made a comment about a pricing article that recommends doubling prices.

The article in question claims,

Higher prices simply work better. Here’s seven psychological reasons why… (and gives seven reasons)

You may think I make a similar claim too. Well, am I on solid ground making a bold claim like this?

If one price is good, two are better.

It states that if it were possible to sell a product profitably at one price, it is certain that there will be higher profit from two prices.  Note that the profit here means Price less marginal cost and does not include fixed cost. You might find there are other cost components that make the second price untenable. But that is a factor you can control.

Some time back I did take a critical look (to the extent I can suppress my own biases) at this claim.

It is impossible for me to redo years of economic research on consumer surplus, price discrimination and related work. The statement I make relies on that body of work, first started by Pigou.

The point to note is that my claim is not an inductive step. The following statement does not follow from it,

“if two prices are good, three are better”

Yet I did not adequately address the certainty in this claim. Shouldn’t this claim be more like,

“if one price is good, two are likely better for most situations”

No.

Let us rely on the work of Thomas Bayes for this (P denotes probability).

What I am stating with my claim is,

P(2 are better | 1 is good)  = 1 and not

P(2 are better) = 1

That is a huge difference.

The first probability statement is a conditional probability. It is the equivalent of stating, “you picked a random card from the deck; if it is the Jack of Spades, then we are certain that it is a face card”.

It states that if it were possible to sell a product profitably at one price, it is certain that there will be higher profit from two prices. Either a moment’s reflection will convince you or you need to delve into tomes of economic research.
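To make the card analogy concrete, here is a minimal sketch that counts cards in a standard 52-card deck. The helper `prob` and the predicate names are my own illustrative choices and are not from any library.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))  # 52 equally likely cards


def is_face(card):
    """Jack, Queen and King count as face cards."""
    return card[0] in {"J", "Q", "K"}


def is_jack_of_spades(card):
    return card == ("J", "spades")


def prob(event, given=lambda card: True):
    """P(event | given), computed by counting equally likely cards."""
    condition = [card for card in DECK if given(card)]
    hits = [card for card in condition if event(card)]
    return Fraction(len(hits), len(condition))


p_face_given_js = prob(is_face, given=is_jack_of_spades)  # conditional
p_face = prob(is_face)                                    # unconditional

print(p_face_given_js, p_face)  # prints: 1 3/13
```

The conditional statement is certain, while the unconditional one is far from it. That is the same gap as between P(2 are better | 1 is good) and P(2 are better).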

The second probability statement is false. To see that, we have to expand it:

P(2 are better) = P(2 are better | 1 is good) . P(1 is good) +
P(2 are better | 1 is not good) . P(1 is not good)

As you can see from practice, P(1 is good) is a much smaller number than 1, and hence it is not at all certain that “two prices are always good”.
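Here is a small sketch of that expansion in code. The two input numbers are purely hypothetical, picked only to show the arithmetic; the one fixed value is the claim P(2 are better | 1 is good) = 1.

```python
def p_two_better(p_one_good, p_two_better_given_one_not_good):
    """Law of total probability for the 'two prices are better' event."""
    p_two_better_given_one_good = 1.0  # the conditional claim made in this post
    return (p_two_better_given_one_good * p_one_good
            + p_two_better_given_one_not_good * (1.0 - p_one_good))


# Hypothetical inputs: only 1 in 10 products can be sold profitably at one
# price, and a second price never rescues a product that fails at one price.
print(p_two_better(p_one_good=0.1, p_two_better_given_one_not_good=0.0))
```

With these made-up inputs the unconditional probability comes out at 0.1, nowhere near 1, even though the conditional claim holds with certainty.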

Setting all of this aside, the net for you, the marketer/entrepreneur/product manager, is

If you find a market for your product at one price, you will find a bigger market (measured in \$\$) at two prices.

## Probability of Another Bear Market – Do Not Listen to Frequentists

By now you already know the market nosedived, plummeted, crashed today – losing 4.5% in just one day. We are down 10% from the peak set in May.

Analysts are quick to call this a correction – by their definition a drop of 10% from peak means we are in correction territory.

Are we going into another bear market (20% or more loss from the peak)?

Birinyi Associates has this to say about the chances of a bear market

Since 1962, there have been 25 corrections greater than 10% during bull markets. Nine of these instances became bear markets. Historically there is a 64% probability that this is only a correction and not the start of a bear market.

The math is not complex: in 16 of 25 past corrections we did not enter a bear market, so they say the chances are 64%.

This is the classic or frequentist approach that simply counts the instances and assigns a probability to the next bear market.
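For the record, the counting behind that 64% is nothing more than the following. The variable names are mine; the counts are the ones in the quote.

```python
corrections = 25         # corrections greater than 10% since 1962 (from the quote)
became_bear_markets = 9  # of those, how many turned into bear markets

stayed_corrections = corrections - became_bear_markets
p_only_a_correction = stayed_corrections / corrections

print(stayed_corrections, p_only_a_correction)  # prints: 16 0.64
```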

But is that correct or relevant?

Estimating the chances of the next bear market is not about counting but about estimating the (un)certainty. Will we enter a bear market? Not easy to answer, and definitely not by counting the sample space or the outcomes. Probability in this case ceases to be a ratio of two countable events and becomes a representation of hunch, degree of belief, gut feel or a hypothesis.

Again, is a bear market just around the corner? We don’t know and neither do those analysts with their incorrect use of probability.

## The A/B Test Is Inconclusive. Now What?

So you just did an A/B test between the old version A and a proposed new version B. Your results from 200 observations show version A received 90 and version B received 110. Data analysis says there is no statistically significant difference between the two versions. But you were convinced that version B is better (because of its UI design, your prior knowledge etc.). So should you give up and not roll out version B?

With A/B testing, it is not enough to find that in your limited sampling, Version B performed better than Version A. The difference between the two has to be statistically significant at a preset confidence level. (See Hypothesis testing.)

It is not all for naught if you do not find a statistically significant difference between your Version-A and Version-B. You can still move your analysis forward with some help from an 18th century minister from England, Thomas Bayes.

Note: What follows below is a bit heavy on the use of conditional probabilities. But if you hang in there, it is well worth it so you do not throw away a profitable version! You could move from being 60% certain to 90% certain that version B is the way to go!

Before I get to that let us start with statistics that form the basis of A/B test.

With A/B testing you are using the Chi-square test to see if the observed frequency difference between the two versions is statistically significant. A more detailed and easy to read explanation can be found here. You are in fact starting with two hypotheses:

The Null hypothesis H0: The difference between the versions is just random

Alternative hypothesis H1: The difference is significant such that one version performs better than the other.

You also choose (arbitrarily) a confidence level, or equivalently a p value threshold, at which you want the results to be significant. The most common is the 95% level, or p=0.05. You then compare your computed Chi-square value with the critical value for that p value at 1 degree of freedom (3.84 for p=0.05): if the computed value is lower you retain H0; if it is equal or greater you reject H0 and accept H1.
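For the 90 versus 110 example, the test can be sketched in a few lines of standard-library Python. I am assuming here that traffic was split evenly, so under H0 each version is expected to receive half of the 200 observations. The function name is my own, and the p value uses the fact that for 1 degree of freedom the Chi-square tail probability equals `erfc(sqrt(chi2 / 2))`.

```python
import math

CRITICAL_VALUE_P05_DF1 = 3.841  # Chi-square critical value, p = 0.05, 1 degree of freedom


def chi_square_two_versions(count_a, count_b):
    """Chi-square statistic and p value for two versions, assuming an even split under H0."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # Tail probability of a Chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_two_versions(90, 110)
print(round(chi2, 2), round(p_value, 3))  # prints: 2.0 0.157
print("reject H0" if chi2 >= CRITICAL_VALUE_P05_DF1 else "retain H0")  # prints: retain H0
```

The computed value of 2.0 is below 3.841, so at the 95% level the test is inconclusive, which is exactly the situation this post is about.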

### A/B Test Results Are Inconclusive. Now What?

Let us return to the original problem we started with. Your analysis does not have to end just because of this result. You can move it forward by incorporating your prior knowledge, with help from Thomas Bayes. Bayes' theorem lets you find the likelihood that your hypothesis will be true in the future given the observed data.

Suppose you were 60% confident that version B will perform better. To repeat, we are not saying Version B will get 60%; we are stating that your prior knowledge says you are 60% certain version B should perform better (i.e., that the difference is statistically significant). That represents your prior.

Then with your previously computed Chi-square value, instead of testing whether or not it is significant at p value 0.05, find the p value at which the Chi-square value becomes significant and compute 1-p (Smith and Fletcher).

In the example I used, p is 0.15 and 1-p is 0.85. In Bayes' terms, this is the likelihood: the probability that the data fit the hypothesis, given the hypothesis.

Then the likelihood your hypothesis will be correct in the future (posterior) is 90%:
(.60 * .85)/(.60*.85 + .40*.15) = .51/.57 ≈ .90

From being 60% certain that version B is better you have moved to being 90% certain that version B is the way to go. You can now decide to go with version B despite the inconclusive A/B test.

If the prior, “60% certain that the difference is statistically significant”, sounds subjective, it is. That is why we do not stop there and instead improve on it with testing. It would help you to read my other article on hypothesis testing to understand that there is subjective information in both classical and Bayesian statistics. With an A/B test we treat the hypothesis as given (in effect setting its probability to 1), whereas in the Bayesian approach we assign it a probability.

For the analytically inclined, here are the details behind this.

With your A/B testing you are assuming the hypothesis as given and finding the probability the data fits the hypothesis. That is conditional probability.

If P(B) is the probability that version B performs better, then P(B|H) is the conditional probability that B occurs given H. With Bayesian statistics you do not do hypothesis testing. You find the conditional probability of the hypothesis given the data, that is, which hypothesis makes sense in light of the data. In this case it is P(H|B). This is what you are interested in when deciding whether or not to go with version B.

Bayes theorem says

P(H|B) = (Likelihood * Prior) / P(B)

Likelihood = P(B|H), what we computed as 1-p (see Smith and Fletcher)

Prior = P(H), your prior knowledge, the 60% certain we used

P(B) = P(B|H)*P(H) + P(B|H-)*(1-P(H))

P(B|H-) is the probability of B given the hypothesis is false. In this model it is p, since we are using P(B|H) = 1-p
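Putting those pieces together, the whole calculation fits in a short function. This is only a sketch of the hybrid shortcut described in this post, with 1-p standing in for P(B|H) and p for P(B|H-). The function and argument names are my own.

```python
def posterior_probability(prior, p_value):
    """P(H|B) using 1 - p as P(B|H) and p as P(B|H-), as in this post's shortcut."""
    likelihood = 1.0 - p_value        # P(B|H)
    likelihood_if_h_false = p_value   # P(B|H-)
    evidence = likelihood * prior + likelihood_if_h_false * (1.0 - prior)  # P(B)
    return likelihood * prior / evidence


# 60% prior certainty that version B is better, p = 0.15 from the A/B test.
posterior = posterior_probability(prior=0.60, p_value=0.15)
print(round(posterior, 2))  # prints: 0.89
```

That is 0.51/0.57, or roughly the 90% quoted earlier. Try a few other priors to see how much the answer leans on the one you bring to the test.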

This is a hybrid approach, using classical statistics (the p value at which you found the A/B test to be significant) and Bayesian statistics. It is a simplification of a fully Bayesian analysis, which is computationally intensive and too much for the task at hand.

The net is you are taking the A/B test a step forward despite its inconclusive results and are able to choose the version that is more likely to succeed.

What is that information worth to you?

References and Case Studies:

1. Wagenmakers, Eric-Jan; Lodewyckx, Tom; Kuriyal, Himanshu; Grasman, Raoul (2010) Bayesian hypothesis testing for psychologists: A tutorial on the Savage–Dickey method. Cognitive Psychology, May 2010, vol 60, issue 3, pp 158-189
2. Smith, D. V. L. and Fletcher, J. H. (2004) The Art & Science of Interpreting Market Research Evidence. Hardcover, May 5, 2004
3. Lilford R. (2003) Reconciling the quantitative and qualitative traditions: the Bayesian approach. Public Money and Management vol 23, 5, pp 2730284
4. Retzer J. (2006) The century of Bayes. International Journal of Market Research vol 48, issue 1, pp 49-59