Statistical Non-Experimental Knowledge in Business

Most of what qualifies as research-oriented business intelligence comes down to identifying (previously unknown) patterns. Typically an analyst or business user sees an anomaly and drills down through charts and tables to investigate, ideally identifying the cause at as granular a level as possible. Subsequently, action is taken. But how scientifically sound is this approach?

Say that for an online toy store we’ve identified that sales for a certain toy are particularly high for Chinese women aged 20-25. How can we know whether this is a regularity or merely happened by chance? The answer is that we cannot know for certain based on the current data alone. At this point it is still a hypothesis, inspired by a merely exploratory analysis. Note that we cannot test the hypothesis on our historical data, as this data generated the hypothesis in the first place. We would need to set up an experiment on new data. One way to do this is to compare, from now until a certain date in the future, the sales of the toy for Chinese women aged 20-25 with those for (a sample of) the rest of the population.
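As a rough illustration of what testing on new data could look like, here is a minimal sketch in Python of a one-sided two-proportion z-test. It assumes we count visitors and buyers of the toy in both groups during the test period; the function name and all the counts are hypothetical, and a z-test is only one of several reasonable choices here.

```python
import math

def two_proportion_z_test(buyers_a, visitors_a, buyers_b, visitors_b):
    """One-sided z-test: is the purchase rate in group A higher than in group B?"""
    rate_a = buyers_a / visitors_a
    rate_b = buyers_b / visitors_b
    # Pooled purchase rate under the null hypothesis of no difference
    pooled = (buyers_a + buyers_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_a - rate_b) / std_err
    # Upper-tail probability P(Z >= z) of the standard normal distribution
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical counts collected AFTER the hypothesis was formed:
# group A = Chinese women aged 20-25, group B = sample of everyone else
z, p_value = two_proportion_z_test(buyers_a=60, visitors_a=1000,
                                   buyers_b=40, visitors_b=1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.3f}")
```

A small p-value on the new data would lend support to the hypothesis; a large one would suggest the original pattern may well have been chance. The threshold for "small" should be decided before the data comes in.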

Many analytics and business intelligence products are lacking in this respect, because 1) they have no support for testing hypotheses at all, or (more commonly) 2) they do not have the workflow systems in place for testing on new data. So the reality is that in most analyses the final step of testing is skipped. This leads to acting on spurious patterns and, in effect, basing decisions on thin air. It is important to educate analysts and business users that being “data-driven” entails more than making decisions based on just looking at numbers and pretty visualizations.

New site!

Hi everybody! Finally, I’m doing something with this domain. It’s been mine for over a year now, but I never really put anything up here. Now is the time. Some of the things I intend to publish here:

  • My thoughts and interesting links on behavioural science, applied statistics, and empirical skepticism
  • How the above translate into business opportunities
  • Academic publications
  • Projects I’m working on, especially in R (the programming language)

This is the site as it is now. As for the exact makeup and direction, I’ll figure that out as I go. Stochastic tinkering, as Nassim Nicholas Taleb, author of my all-time favourite “The Black Swan: The Impact of the Highly Improbable”, would call it. A reference I will surely call upon frequently. So stay tuned, more will follow soon!