Statistical Non-Experimental Knowledge in Business

Most of what qualifies as research-oriented business intelligence comes down to identifying previously unknown patterns. Typically an analyst or business user sees an anomaly and drills down through charts and tables to investigate, ideally identifying the cause at as granular a level as possible. Subsequently, action is taken. But how scientifically sound is this approach?

Say that for an online toy store we’ve identified that sales of a certain toy are particularly high among Chinese women aged 20-25. How can we know whether this is a regularity or merely happened by chance? The answer is that we cannot know for certain based on the current data alone. At this point it is still a hypothesis, inspired by a merely exploratory analysis. Note that we cannot use our historical data to test it, as this data generated the hypothesis in the first place. We would need to set up an experiment on new data. One way to do this is to compare, from now until a certain date in the future, the sales of the toy for Chinese women aged 20-25 with those for (a sample of) the rest of the population.
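To make the comparison concrete, here is a minimal sketch of how such a test could look once the new data has been collected. It assumes we count visitors and buyers of the toy in each group during the test period and applies a one-sided two-proportion z-test, which is only one of several reasonable choices. The function name and all numbers are hypothetical and purely illustrative.

```python
import math


def two_proportion_z_test(buyers_a, visitors_a, buyers_b, visitors_b):
    """One-sided two-proportion z-test.

    Tests whether group A's purchase rate is higher than group B's.
    Returns the z statistic and the one-sided p-value.
    """
    rate_a = buyers_a / visitors_a
    rate_b = buyers_b / visitors_b

    # Pooled rate under the null hypothesis that both groups buy at the same rate.
    pooled = (buyers_a + buyers_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))

    z = (rate_a - rate_b) / std_err
    # Upper-tail probability of the standard normal distribution: P(Z >= z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts gathered AFTER the hypothesis was formed.
# Group A: the segment of interest; group B: a sample of everyone else.
z, p_value = two_proportion_z_test(
    buyers_a=60, visitors_a=1000,
    buyers_b=400, visitors_b=10000,
)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

A small p-value on the new data would support the hypothesis that the segment really does buy the toy more often. The same calculation on the historical data that suggested the pattern would prove nothing, because that data was selected precisely for showing it.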

Many analytics and business intelligence products are lacking in this respect, because 1) they have no support for testing hypotheses at all, or (more commonly) 2) they do not have the workflow systems in place for testing on new data. As a result, in most analyses the final step of testing is skipped. This leads to acting on spurious patterns and, in effect, basing decisions on thin air. It is important to educate analysts and business users that being “data-driven” entails more than making decisions based on just looking at numbers and pretty visualizations.
