If you cared to run the numbers – Looking beyond the beauty of Infographics

I debated whether or not to write this article. There is really no point in writing articles that point out flaws in some popular piece. Neither the authors of those posts nor the audience care. Those who do care already understand the math, so this article adds no incremental value for them.

But the case in point is so egregious that it serves as a poster boy for the need to run the numbers, test BIG claims for their veracity, and look beyond the glossy eye candy.

This one comes from VentureBeat and has a very catchy title that got 2125 people to Like it on Facebook. All of them likely just read the title and were satisfied with it, or saw the colorful infographic and believed the claim without bothering to check for themselves. There is also comfort in knowing that they are not alone in the Likes.

You can’t expect the general population to do some critical thinking or any analysis given the general lack of statistical skills and their cognitive laziness. It is the System-1 at work with a lazy System-2 (surely you bought Kahneman’s new book).

You would think the author of the article should have checked, but the poor fellow is likely a designer who can do eye-popping infographics but cannot run tests for statistical significance. He is likely an expert in stating whether using rounded corners with certain shading is better #UX or not.

The catchy title and the subject also don’t help.

So almost everyone accepts the claim for what it is. But is there one bit of truth in VentureBeat’s claim?

Let us run the numbers here.

Without further ado, here is the title of the article that 2125 Facebook users Liked.

Women who play online games have more sex (Infographic)

How did they arrive at the claim? They looked at data collected by Harris Interactive, which surveyed over 2000 adults across the US. Since the survey found that 57% of female gamers reported having sex vs. 52% of female non-gamers, the article makes the bold claim in its title. Here is a picture to support the claim.

The claim, supported by the beautiful picture, sounds plausible, doesn't it?

How would you verify that the difference is not just statistical noise?

You would run a simple crosstab (chi-square test), and there are online tools that make this step easier. What does this mean? You will test whether the difference between the number of female gamers who reported having sex and the number of female non-gamers who reported the same is statistically significant.

The first step is to work with absolute numbers, not percentages. We need the numbers that 57% and 52% correspond to. For this we need the number of females surveyed and how many of them are gamers and non-gamers.

The VentureBeat infographic says, “over 2000 adults surveyed”. The exact number happens to be 2132.

Let us find the number of gamers among females. The article says that, of the gamers, 55% are females and 45% are males. This is not the same as saying 55% of females are gamers. Interestingly, they never reveal to us what percentage of the surveyed people are gamers. So we resort to data from other sources. One such source (circa 2008) says 42% of the population plays games online. We can assume that the number is now 50%.

So the number of gamers and the number of non-gamers are 1066 each. Then we can say (using data from the infographic)

Number of female gamers = 55% of 1066 = 586.3, rounded up to 587
Number of female non-gamers = ?? (it is not 1066-587)

The survey does not give the number of males vs. females, but we can assume it is split roughly evenly. (If you want to be exact you can use the ratio from census.gov, which states 50.9% female to 49.1% male.) So there are likely about 1089 females surveyed.

That makes the number of female non-gamers = 1089 - 587 = 502
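
To see why "55% of gamers are female" cannot be read as "55% of females are gamers", here is a small sketch in Python. It only reuses the head counts from the text (1066 gamers, 587 female gamers, 1089 females), which themselves rest on the assumptions made above; the variable names are mine.

```python
# Head counts as stated in the text (based on the assumed 50% gamer share
# and the assumed number of women surveyed, not on published survey data).
GAMERS = 1066
FEMALE_GAMERS = 587
FEMALES_SURVEYED = 1089

# Share of gamers who are female: the figure the infographic reports.
female_share_of_gamers = FEMALE_GAMERS / GAMERS

# Share of females who are gamers: a different ratio with a different base.
gamer_share_of_females = FEMALE_GAMERS / FEMALES_SURVEYED

print(f"Of gamers, {female_share_of_gamers:.1%} are female")
print(f"Of females, {gamer_share_of_females:.1%} are gamers")
```

The two percentages share a numerator but not a denominator, which is also why the number of female non-gamers has to come from the female total and not from the 1066 non-gamers.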

The next step is to find the number of women who reported having sex (easy to do from their graph).

Number of female gamers who reported having sex = 57% of 587 = 335 (not having sex = 587-335 = 252)

Number of female non-gamers who reported having sex = 52% of 502 = 261 (not having sex = 502-261 = 241)
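
If you would rather not do this arithmetic by hand, the same steps can be sketched in a few lines of Python. This takes the 587 female gamers and 1089 females as given by the text (both are assumptions, not survey data) and the 57% and 52% from the infographic; the helper name split is just for illustration.

```python
# Inputs as stated in the text. The two head counts rest on the article's
# assumptions (50% gamer share, census gender split), not on survey data.
FEMALES_SURVEYED = 1089
FEMALE_GAMERS = 587
PCT_GAMERS_HAVING_SEX = 0.57      # from the infographic
PCT_NON_GAMERS_HAVING_SEX = 0.52  # from the infographic

female_non_gamers = FEMALES_SURVEYED - FEMALE_GAMERS


def split(total, share):
    """Return (yes, no) head counts for a group, rounded to whole people."""
    yes = round(total * share)
    return yes, total - yes


gamers_yes, gamers_no = split(FEMALE_GAMERS, PCT_GAMERS_HAVING_SEX)
non_gamers_yes, non_gamers_no = split(female_non_gamers, PCT_NON_GAMERS_HAVING_SEX)

print("female non-gamers:", female_non_gamers)
print("gamers (sex, no sex):", gamers_yes, gamers_no)
print("non-gamers (sex, no sex):", non_gamers_yes, non_gamers_no)
```

Rounding to whole people is a choice made here; the percentages in the infographic are themselves rounded, so the counts are only approximate.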

Now you are ready to build the 2x2 contingency table.
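
Here is one way to lay the table out, as a minimal Python sketch. The four cell counts are the ones derived in the text; the row and column totals are simply their sums, and the function name format_table is hypothetical.

```python
# Cell counts derived in the text.
# Rows: gamers / non-gamers. Columns: reported sex / did not.
table = {
    "Female gamers": (335, 252),
    "Female non-gamers": (261, 241),
}


def format_table(rows):
    """Render a 2x2 table with row and column totals as plain text."""
    lines = [f"{'':<20}{'Sex':>8}{'No sex':>8}{'Total':>8}"]
    col_yes = col_no = 0
    for label, (yes, no) in rows.items():
        lines.append(f"{label:<20}{yes:>8}{no:>8}{yes + no:>8}")
        col_yes += yes
        col_no += no
    lines.append(f"{'Total':<20}{col_yes:>8}{col_no:>8}{col_yes + col_no:>8}")
    return "\n".join(lines)


print(format_table(table))
```

The totals are worth a glance before testing anything: the rows should add back to 587 and 502, and the grand total to 1089.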

Then you run the chi-square test to see if the difference between the numbers is statistically significant.

H0 (null hypothesis): The difference is just random

H1 (alternative hypothesis): The difference is not just random, more female gamers do have sex than female non-gamers.

You use the online tool and it does the work for you.

What do we see from the results? The calculated chi-square statistic is 2.82. For the difference to be statistically significant at a p-value threshold of 0.05 (95% confidence), the statistic has to be at least 3.84 (degrees of freedom = 1).

Since that is not the case here, we see no reason to reject the null hypothesis that the difference is just random.
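
If you do not trust an online tool, the test is short enough to sketch yourself. The Python below computes the plain Pearson chi-square for a 2x2 table, without a continuity correction, using only the standard library. That choice is my assumption about what the tool did; a tool that applies Yates' correction would report a somewhat smaller statistic. The function name chi_square_2x2 is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square tail probability reduces to
    # the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


CRITICAL_VALUE_5PCT_DF1 = 3.84  # threshold quoted in the text

# Counts from the text: gamers (sex, no sex), then non-gamers (sex, no sex).
stat, p_value = chi_square_2x2(335, 252, 261, 241)

print(f"chi-square = {stat:.2f}, p-value = {p_value:.3f}")
print("significant at the 5% level:", stat >= CRITICAL_VALUE_5PCT_DF1)
```

With the counts above the statistic comes out at roughly 2.82, in line with the figure quoted earlier, and the p-value is above 0.05, so the sketch reaches the same verdict as the tool.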

You can repeat this for their next chart, which shows the share who have sex at least 1x per week, and you will again find no reason to reject the null hypothesis.

So the BIG claim made by VentureBeat’s article and its colorful infographic is just plain wrong.

If you followed this far you can see that it is not easy to seek the right data and run the analysis. Most importantly it is not easy to question such claims from a popular blog. So we tend to yield to the claim, accept it, Like it, tweet it, etc.

Now that you have learned to question such claims and credibly analyze them, go apply what you learned to every big claim you read.