Most of my work on pricing and consumer behavior relies on hypothesis testing. Whether I am finding a difference in means between two groups, running a non-parametric test, or making a causal claim, I apply hypothesis testing, explicitly or implicitly. I make overarching claims about customer willingness to pay, and about the factors that influence it, based on hypothesis testing. The same is true of the most popular topic these days for anyone with a web page: A/B split testing. There is nothing wrong with these methods, and I bet I will continue to use them in all my other work.
We should note, however, that stating hypotheses and finding statistically significant differences should not blind us to the fact that some amount of subjectivity goes into all of this. Another important distinction: despite the name "hypothesis testing", we are not testing whether the hypothesis is validated but whether the data fit a hypothesis that we take as given. More on this below.
All these tests proceed as follows:
- Start with the hypothesis. In fact you always start with two. The first is the null hypothesis, which is the same for any statistical test.
The null hypothesis H0: The observed difference between subjects (or groups) is just due to randomness.
Then you write down the hypothesis that you want to make a call on.
Alternative hypothesis H1: The observed difference between subjects (or groups) is indeed due to one or more treatment factors that you control.
- Pick the statistical test you want to use from those available for your case. It could be a non-parametric test like chi-square, which does not assume the data follow a normal distribution (A/B testing), or a parametric test like the t-test, which assumes the data are normally distributed (i.e., Gaussian).
- Select a confidence level for the test: 90%, 95%, or 99%, with 95% being the most common. This choice is completely subjective. What you are stating with the confidence level is that the results are statistically significant only if a difference this large would arise from randomness alone in less than 5% (100% - 95%) of cases. The same threshold is also expressed as a significance level, the cutoff against which the p-value (a probability) is compared; in this case, 0.05.
- Perform the test with random sampling. This needs more explanation but is beyond the scope of what I want to cover here.
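The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the visitor and conversion counts are hypothetical, the function name `chi_square_2x2` is mine, and the test is a plain Pearson chi-square on a 2x2 table without continuity correction. With one degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x / 2)), so the standard library is enough.

```python
import math

def chi_square_2x2(conv_a, n_a, conv_b, n_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value). The p-value is the probability of a
    statistic at least this large if H0 (randomness only) is true.
    """
    observed = [
        [conv_a, n_a - conv_a],  # version A: converted, did not convert
        [conv_b, n_b - conv_b],  # version B: converted, did not convert
    ]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, total - conv_a - conv_b]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if version and conversion were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected

    # Upper tail of chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

ALPHA = 0.05  # the subjective threshold, chosen before looking at the data

# Hypothetical A/B test: conversions out of visitors for each version
stat, p = chi_square_2x2(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
print("reject H0" if p < ALPHA else "fail to reject H0")
```

Note that the only hypothesis the arithmetic ever uses is H0: the expected counts are what we would see if the version made no difference. H1 appears nowhere in the calculation.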
As you can see, we, the analysts or decision makers, make up the hypothesis, and we treat the hypothesis as given. We did the right thing by writing it down first. (A common mistake in many A/B tests and data mining exercises is writing the hypothesis after the test.)
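What we are testing is this: given that a hypothesis is true, what is the probability of seeing the test data D? Strictly speaking, the hypothesis the test treats as given is the null hypothesis H0 (P(H0) = 1), so the quantity computed is the probability of data at least as extreme as D under H0.
This is expressed as P(D|H0). Statistical significance at the 95% level means P(D|H0) < 0.05 given P(H0) = 1. Nothing in the test computes P(H1|D), the probability that our hypothesis is true given the data.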
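One way to see that the test puts a probability on the data, and not on the hypothesis, is to simulate it. The following is a minimal sketch, not a definitive method: the counts are hypothetical, the function name `simulated_p_value` is mine, and I assume that "randomness only" means both versions share a single conversion rate, estimated by pooling the two groups. The code takes that hypothesis as given, generates many fake experiments from it, and counts how often the fake difference is at least as large as the observed one.

```python
import random

def simulated_p_value(conv_a, n_a, conv_b, n_b, n_sims=2000, seed=0):
    """Estimate P(difference at least as large as observed | H0).

    H0 is treated as true: both groups convert at the pooled rate.
    """
    rng = random.Random(seed)
    pooled_rate = (conv_a + conv_b) / (n_a + n_b)

    # Compare differences in rates via integer cross-multiplication
    # (conv_b/n_b - conv_a/n_a scaled by n_a*n_b) to avoid float ties.
    observed_gap = abs(conv_b * n_a - conv_a * n_b)

    extreme = 0
    for _ in range(n_sims):
        # One fake experiment generated under H0
        sim_a = sum(rng.random() < pooled_rate for _ in range(n_a))
        sim_b = sum(rng.random() < pooled_rate for _ in range(n_b))
        if abs(sim_b * n_a - sim_a * n_b) >= observed_gap:
            extreme += 1
    return extreme / n_sims

# Hypothetical A/B test: 100/1000 conversions for A, 130/1000 for B
p_sim = simulated_p_value(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"share of simulated experiments at least as extreme: {p_sim:.3f}")
```

The number printed is a statement about data generated under an assumed hypothesis. At no point does the simulation ask how likely any hypothesis is.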
When we say we accept H1, we are really saying that H0 (randomness) is unlikely to be the reason, and hence H1 must be true. In doing so we rule out the possibility that the observed data could be explained by any number of alternative hypotheses. Since we wrote the original hypothesis, if we did not base it on proper qualitative analysis we could be wrong even though our tests yield statistically significant results.
This is why you should never launch a survey without doing focus groups and customer interviews. This is why you don’t jump into statistical testing before understanding enough about the subjects under study to frame relevant hypotheses. Otherwise you are, as some wrote to me, using gut feel or pulling things out of thin air, and accepting your hypothesis simply because the data give enough evidence to reject the null hypothesis.
How do you come up with your hypotheses?
Look for my next article on how this is different in Bayesian statistics.