What the Data Says About Increasing the Retweetability of Tweets by Adding Links

TLDR: Data scientists at IterativePath analyzed tweets over weeks and busted the myth that adding links to a tweet causes it to be retweeted more.

Admit it: you worry about getting your tweets retweeted. Hopefully many times, but you will settle for just once. You wish someone a lot more famous would notice and retweet your sparkling thought. You come up with something original, innovative and awesome. After all, it is only fair that the world sees it and appreciates it by spreading it.

This applies to individuals and brands alike (the latter essentially have interns with fancy social media titles managing their tweets).

So you look for ways to increase retweetability: what knobs can you turn? As you do a Google search, you chance upon an article titled “scientifically proven ways to increase retweetability”. One such proven way from that article is:

Adding links to a tweet increases its retweetability.

That article convinces you to arrive at this conclusion (because you have willfully suspended skepticism) by stating this observed data (mind you, from analyzing tens of thousands of tweets):

  1. Among the tweets that were Retweeted, 56.7% had a link in them
  2. Among the tweets that were not Retweeted, 19% had a link in them

If you go back and read the article I wrote on these numbers, you will see that they do not say anything about retweetability.
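One way to read this: the two quoted figures are the share of links among retweeted and non-retweeted tweets. The question you care about runs the other way: how likely is a tweet with a link to be retweeted? That also depends on how often tweets get retweeted at all, which the quote leaves out. Here is a minimal sketch using Bayes' rule. The base rates in the loop are made-up inputs for illustration, not numbers from that article.

```python
def p_retweet_given_link(p_link_given_rt, p_link_given_no_rt, p_rt):
    """Bayes' rule: P(RT | link) from the two quoted shares and a base rate."""
    # Total probability that a tweet has a link.
    p_link = p_link_given_rt * p_rt + p_link_given_no_rt * (1 - p_rt)
    return p_link_given_rt * p_rt / p_link

# The two shares quoted above; the base rates are hypothetical.
for base_rate in (0.01, 0.10, 0.50):
    p = p_retweet_given_link(0.567, 0.19, base_rate)
    print(f"base rate {base_rate:.0%} -> P(RT | link) = {p:.1%}")
```

The same two shares give very different answers depending on the assumed base rate, so on their own they cannot tell you what a link does to your chances.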

I decided to do my own testing. Unlike others who do data dredging, I practice data science, so I started with the hypotheses:

Null hypothesis H0: Any difference found in retweetability between tweets with links and tweets without links is just randomness.

Alternative hypothesis H1: Links do make a difference in retweetability.

Method: I will use a non-parametric test, the Chi-square test, to check for statistical significance. This test is non-directional. That is, it is not going to tell us which way the difference goes, only whether the difference is statistically significant or not.

Data collection: Randomly collect a sample of original tweets (that is, excluding those of the form “RT @SomeOne …”) and analyze them. I used the python-twitter API to collect tweet data and metadata such as the retweet count and whether the tweet contains a link. I collected about 3500 samples, which is good enough. In fact, too large.
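The tallying step can be sketched roughly as below. This is an illustration under stated assumptions, not the script used for this post: the `Tweet` record and its field names are hypothetical stand-ins for whatever the API returns, and link detection is a simple pattern match on the text.

```python
import re
from dataclasses import dataclass

LINK_RE = re.compile(r"https?://\S+")

@dataclass
class Tweet:
    # Hypothetical record; these field names are assumptions for illustration.
    text: str
    retweet_count: int

def is_original(tweet):
    """Exclude manual retweets of the form 'RT @SomeOne ...'."""
    return not tweet.text.lstrip().startswith("RT @")

def tally(tweets):
    """Count original tweets into a 2x2 table keyed by (retweeted, has_link)."""
    table = {(rt, link): 0 for rt in (False, True) for link in (False, True)}
    for t in tweets:
        if not is_original(t):
            continue
        retweeted = t.retweet_count > 0
        has_link = LINK_RE.search(t.text) is not None
        table[(retweeted, has_link)] += 1
    return table

# Made-up sample tweets, only to show the shape of the output.
sample = [
    Tweet("Read this http://example.com/post", 0),
    Tweet("Just a thought", 2),
    Tweet("RT @SomeOne Just a thought", 5),
]
print(tally(sample))
```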

Data Summary: Here is where the data stands:

          No Link   Link   Total
No RT        2248   1072    3320
RT            177     44     221
Total        2425   1116    3541

First observation: just about 6% of the tweets got even a single retweet. The vast majority of original tweets get no love whatsoever.

But that was not the hypothesis. So let us test our hypothesis with the data in row 2 by calculating the Chi-square value.
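The calculation itself is not shown in this post, so here is one way it could be done. This sketch assumes a Pearson chi-square test of independence on the full 2x2 table, with no continuity correction. The exact setup used for the post may have differed. For one degree of freedom the p-value can be computed with the standard library alone.

```python
import math

# Observed counts from the table: rows are (No RT, RT), columns are (No Link, Link).
observed = [[2248, 1072],
            [177, 44]]

def chi_square_2x2(obs):
    """Pearson chi-square statistic for a 2x2 table, without continuity correction."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if links and retweets were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs[i][j] - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

stat = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, p = {p_value_df1(stat):.5f}")
```

A statistic above 3.84 corresponds to significance at the 5% level for one degree of freedom.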



The difference is indeed statistically significant. That is, links do make a difference. But recall what I stated about the Chi-square test being non-directional. So you need to look at the data and apply your mind to see which way the difference goes.

You can see that just 20% of retweeted tweets have a link in them, while 80% have no link (row 2 of the table above).

More importantly, look at column 2. Of the tweets with links, 96% have zero retweets and just 4% were retweeted. By comparison, about 7% of the tweets without links (column 1) were retweeted.
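The row and column percentages quoted here come straight from the table counts. A short sketch of the arithmetic, with variable names chosen only for this illustration:

```python
# Counts from the data summary table.
no_rt_no_link, no_rt_link = 2248, 1072
rt_no_link, rt_link = 177, 44

# Row view: of the retweeted tweets, what share had a link?
link_share_of_rt = rt_link / (rt_no_link + rt_link)

# Column view: of the tweets with (or without) a link, what share was retweeted?
rt_rate_link = rt_link / (no_rt_link + rt_link)
rt_rate_no_link = rt_no_link / (no_rt_no_link + rt_no_link)

print(f"share of retweeted tweets with a link: {link_share_of_rt:.1%}")
print(f"retweet rate, tweets with a link:      {rt_rate_link:.1%}")
print(f"retweet rate, tweets without a link:   {rt_rate_no_link:.1%}")
```

The column view is the one that answers the question asked: it compares how tweets with and without links actually fared.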

So links make a difference for the worse, breaking the myth propagated by previous articles on this.