Whale of a Sample Size in Statistical Testing

Australia is taking Japan to court to stop it from killing whales in the name of scientific testing. The whales that are captured and killed for “research” are later sold as food. In a year, Japan harpoons and kills about 1000 whales for its research work.

What has this got to do with statistical significance?

We have to go all the way back to 2005, when Japan implemented what it called JARPA-2 against the wishes of the International Whaling Commission. Under JARPA-2, Japan increased its whale intake from its then sampling rate of 440 whales to 1000 whales.

“We will implement JARPA-2 according to the schedule, because the sample size is determined in order to get statistically significant results.”

When everything else is held constant, increasing the sample size from 440 to 1000 increases statistical significance because of the way the standard error (SE) is computed. The SE, which measures sampling precision, goes from σ/√440 to σ/√1000, a number about a third lower, which makes statistical significance far easier to reach. (see reference)
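To make the arithmetic concrete, here is a minimal Python sketch of how the standard error shrinks between the two sample sizes. The value of sigma is an arbitrary placeholder (it cancels out of the ratio), and the helper name standard_error is only for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# sigma is an assumed placeholder; any positive value gives the same ratio.
sigma = 1.0

se_440 = standard_error(sigma, 440)
se_1000 = standard_error(sigma, 1000)

# How much smaller the SE is at n = 1000 compared with n = 440.
ratio = se_1000 / se_440

# For a fixed observed difference, the test statistic (difference / SE)
# grows by the inverse of that ratio.
z_inflation = 1 / ratio

print(f"SE at n=440:  {se_440:.4f} * sigma")
print(f"SE at n=1000: {se_1000:.4f} * sigma")
print(f"SE ratio: {ratio:.3f}, test statistic inflated by {z_inflation:.2f}x")
```

The ratio is simply √(440/1000), so the same observed effect yields a test statistic roughly one and a half times larger without the effect itself changing at all.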

Under the cloak of statistical significance, more whales are being sampled without regard to economic or ecological significance.

Consider this in the context of your A/B testing. Yes, even minor differences will appear statistically significant by the magic of large samples. But statistical significance alone is not sufficient; we need to ask whether these differences have economic significance. Should we chase these tiny differences and lose the opportunity to win over the other 97% who are not converting?
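As a rough illustration of that point, the sketch below runs a standard pooled two-proportion z-test on the same tiny lift at two sample sizes. The conversion rates (3.0% versus 3.1%) and the visitor counts are made-up numbers chosen for the example, and the function name two_proportion_p_value is hypothetical.

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Assumed example data: variant A converts at 3.0%, variant B at 3.1%.
# Small test: 10,000 visitors per arm.
p_small = two_proportion_p_value(300, 10_000, 310, 10_000)

# Large test: 1,000,000 visitors per arm, same conversion rates.
p_large = two_proportion_p_value(30_000, 1_000_000, 31_000, 1_000_000)

print(f"p-value with 10,000 per arm:    {p_small:.4f}")
print(f"p-value with 1,000,000 per arm: {p_large:.6f}")
```

With these assumed numbers the 0.1 point lift is nowhere near significant at 10,000 visitors per arm and comfortably below 0.05 at a million per arm. The lift is identical in both cases; only the sample size changed, which is why the business value of the difference has to be judged separately.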