Let us hunt for something interesting in this data gold mine

How many times have you heard this?

We are collecting a lot of data on our customers/transactions/sales/logs. Let us look at this gold mine to see if we can find anything interesting.

The problem with seeking something interesting is that you are bound to find it. You might call the next statement tautological, but the fact is that if it is interesting, you are bound to find it. To a determined data miner, some interesting statistical outlier will eventually show up; after that, it is simply a matter of writing the hypothesis that predicts it.
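To see why an outlier is bound to show up, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the 200 "segments", the 50 "customers" per segment, and the 1.96 cutoff (roughly a 5% significance level) are hypothetical, and every value is pure random noise, so there is nothing real to find.

```python
import random
import statistics


def mine_for_outliers(n_segments=200, n_customers=50, z_cutoff=1.96, seed=0):
    """Return the indices of segments whose mean looks 'significant'.

    Every value is drawn from the same standard normal distribution,
    so no segment has a real effect. Any hit is a false discovery.
    """
    rng = random.Random(seed)
    hits = []
    for segment in range(n_segments):
        values = [rng.gauss(0.0, 1.0) for _ in range(n_customers)]
        mean = statistics.fmean(values)
        # Standard error of the mean, from the sample standard deviation.
        std_err = statistics.stdev(values) / n_customers ** 0.5
        # How many standard errors the segment mean sits away from zero.
        z = mean / std_err
        if abs(z) > z_cutoff:
            hits.append(segment)
    return hits


hits = mine_for_outliers()
print(f"{len(hits)} of 200 pure-noise segments look 'interesting'")
```

With a cutoff near the 5% level, we should expect roughly one segment in twenty to look interesting by chance alone. Slice the data enough ways and a handful of segments will clear the bar, each one ready for a story to be written around it.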

Data mining, or as some might call it, data trolling, is looking for patterns in data that happens to be sitting around, as opposed to deliberate decision making, which requires seeking specific information to reduce uncertainty. But data can fit any number of hypotheses. When we mine for a cause, we are bound to pick:

  • Ones that are most convenient, like the man searching for his lost key under the light.
  • Ones that are familiar, based on our past experience and our beliefs. There are many fables about this.

The way to make informed decisions is to frame a hypothesis based on the best prior knowledge we have. Know that this is just a hypothesis, not a fact, and that it has uncertainties associated with it. Then collect specific data to refine it and reduce the uncertainty.
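One way to picture this is a simple Bayesian update. The sketch below is only an illustration under stated assumptions: the question (what fraction of customers will buy), the prior (about 20%, held loosely), and the survey result (30 of 100 say yes) are all hypothetical numbers, and the Beta-Binomial model is my choice for the example, not the only way to do this.

```python
def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance ** 0.5


def update(alpha, beta, successes, failures):
    """Fold new yes/no observations into a Beta prior."""
    return alpha + successes, beta + failures


# Hypothetical prior: we believe about 20% of customers will buy,
# but we hold that belief loosely.
prior = (2.0, 8.0)

# Hypothetical data collected specifically to test the hypothesis:
# 30 of 100 surveyed customers say they would buy.
posterior = update(*prior, successes=30, failures=70)

prior_mean, prior_sd = beta_mean_sd(*prior)
post_mean, post_sd = beta_mean_sd(*posterior)

print(f"before: {prior_mean:.2f} +/- {prior_sd:.2f}")
print(f"after:  {post_mean:.2f} +/- {post_sd:.2f}")
```

The estimate moves toward what the data says and the spread around it shrinks, but it never reaches zero. The hypothesis is refined, not turned into a fact.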

We will never know all the facts with certainty, but if we realize that what we know has uncertainty associated with it and there could be far more that we do not yet know, we are on the right track.

How do you make your decisions?

Bye Bye Mr. Memory! Hello Mr. Insight

In the Alfred Hitchcock film The 39 Steps, the opening scene features a stand-up act by a man introduced to us as Mr. Memory. People paid money to come to this show. He was someone who had committed at least 50 facts per day to memory and could answer any audience question. The questions ranged from the distance between Winnipeg and Montreal to baseball statistics. Today we do not need Mr. Memory, nor do we appreciate committing facts to memory. We have Google, or Bing, or the next big search engine.

If you look closely at Mr. Memory’s act, it was still an entertainment act. Had it been a rote regurgitation of facts, the audience would not have paid good money to get there. He was witty and the audience was laughing. Mr. Memory would have bombed if the audience had been bored or laughing at him instead of at his jokes.

Google or not, data, information and facts have always been available to those who sought them. Data might not have been free, and there may have been transaction costs, but it was available. People protected data as if the value were intrinsic to the data. Value is not intrinsic to data. Value is created from the insights one derives from it to serve a market and gain the upper hand over the competition.

There is a quote attributed to Sam Walton (I cannot verify its authenticity): “I am not so much afraid of someone stealing my data as someone can make better decisions with it than I can”. Whether or not Sam Walton said this, the statement holds true.

Mr. Memory may not have a job today, but he knew then that his advantage came from doing something different with the information: delivering entertainment that competed for customer wallet share against other forms of entertainment.

Do you, as a decision maker, just seek data for its own sake or create actionable insights that deliver profits?

Fact Based Decision Making

IBM acquires SPSS:

“The opportunity is to move from sense-and-respond decision-making to a predict-and-act model,” said Ambuj Goyal, a computer scientist who is the general manager of I.B.M.’s information management business.

The growing appeal of the sector, Mr. Davis said, reflects the increasing pressure by the senior management of corporations for “tighter, fact-based decision making, especially in this economy.”

Other independent analytics software makers may well become takeover targets, said Mr. Evelson of Forrester. Among the candidates, he said, are Accelrys, Applied Predictive Technologies, Genalytics, InforSense, KXEN and ThinkAnalytics.

The broad consolidation wave in business intelligence software, analysts say, will bring increasing price pressure on some segments of the industry as major companies seek to increase their share of the market. And the open-source programming language for data analysis, R, is another source of price pressure on software suppliers.