Analytics is Incomplete without Human Judgement
Analytics is fun. Exploring huge amounts of data can be an incredibly exciting journey, but it can also be a monumental waste of time and resources if the results fail to solve real-world problems. Today, companies are in an advantageous position, with access to robust and sophisticated analytic tools. However, tools alone are not enough for Big Data analytics to deliver business results. My summer internship in the highly analytical environment of Amazon, and an exchange semester packed with Marketing Analytics courses at the Kellogg School of Management, have taught me that analytics is incomplete without human judgement and expertise. Effective descriptive, predictive and prescriptive models can be developed under the direction of managers who have deep business knowledge and understand how to distinguish good analytics from bad analytics. To illustrate my point, let me tell you a story.
Good or Bad Analytics?
Recently, the Chief Data Scientist of an automobile manufacturer shared their online marketing results with senior management in a quarterly update meeting. The results tracked a consumer's ad exposure on Google and then linked it to information about whether the consumer ended up buying a car from one of the company's dealerships. Four groups of consumers were tracked: the first group saw no automotive ads; the second saw ads from dealers only; the third saw ads from the manufacturer only; and the last group saw ads from both the manufacturer and dealers. The conversion rate was less than 1 percent for consumers who saw no auto ad. It was around 3 percent for those who saw dealer ads only. For consumers who saw manufacturer ads only, the rate was about 5 percent. But for those who saw both manufacturer and dealer ads, the conversion rate jumped to 14 percent. How would a layperson interpret these results? Google advertising is effective, and manufacturer and dealer ads together result in higher conversion than individual ads alone. The analytical manager, however, would ask: is this good analytics or bad analytics?
Begin with Causality
To begin with, managers at this meeting should ask: are consumers buying cars because of Google advertising? It is imperative at this stage to establish causality. How does Google advertising work in the first place? It is based on search terms: if you search for a particular topic, Google remembers it and then shows you ads relevant to your search history. After all, if you are not interested in buying a car, you won't search Google for one, you won't see the ads, and you won't buy a car. This is a simplistic story, but it illustrates the nature of bad analytics: the groups of consumers differ before any ad is shown. We could trust this data only if the groups were identical except for the fact that one group saw the ad and the other did not. Only then could we attribute conversions to the trigger of Google advertising. Analytic models can tell stories; it is up to us humans to interpret them, judge them and then, eventually, trust them.
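For readers who want to see the mechanics, here is a minimal simulation with made-up numbers (my own illustration, not data from the case) in which ad exposure has zero causal effect, yet the exposed group converts far more often, simply because purchase intent drives both searching (and therefore seeing ads) and buying:

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Ad exposure has ZERO causal effect here; purchase intent drives everything."""
    exposed = exposed_buyers = unexposed = unexposed_buyers = 0
    for _ in range(n):
        intent = random.random() < 0.10              # 10% are in-market for a car
        sees_ad = intent and random.random() < 0.80  # only in-market people search, so only they see ads
        buys = intent and random.random() < 0.30     # buying depends on intent alone, never on the ad
        if sees_ad:
            exposed += 1
            exposed_buyers += buys
        else:
            unexposed += 1
            unexposed_buyers += buys
    return exposed_buyers / exposed, unexposed_buyers / unexposed

rate_ad, rate_no_ad = simulate()
print(f"conversion with ad:    {rate_ad:.1%}")    # roughly 30%
print(f"conversion without ad: {rate_no_ad:.1%}") # under 1%
```

The comparison looks like a resounding win for advertising, even though the ad does literally nothing in this model; the gap is pure self-selection.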
A Checklist to Uncover Bad Analytics
Analytics can seem daunting. Statistics, algorithms, models, tricks. Phew! Fear not, take a deep breath. Here’s a simple checklist that can nearly always uncover bad analytics you encounter in your career.
1. Begin with understanding the data-generating process.
Did you conduct an experiment to obtain your dataset or is this data your firm has on its customers and their purchase behavior? Is this data completely reliable? Can this data be supplemented with other data from another source? Once you understand the data-generating process inside-out, it becomes easier for your judgement to identify any potential mistakes.
2. Are there pre-existing differences between groups?
The next step is to understand the consumer groups whose behavior we are trying to analyze or predict. Are the groups probabilistically equivalent? Are they identical except for the fact that they took a certain action that we are analyzing? Are group membership and outcomes both responding to a common factor? Did we take any actions that have impacted the outcome? This information is the foundation of a sound analytical project.
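As a quick illustration (again with synthetic numbers), a small simulation shows why probabilistic equivalence matters: when customers self-select into a group, a pre-existing trait such as prior spend already differs between groups before anyone takes any action, whereas random assignment leaves the groups balanced:

```python
import random
import statistics

random.seed(2)

# Each customer has a pre-existing trait: prior spend, in dollars.
spend = [random.gauss(50, 10) for _ in range(20_000)]

# Self-selection: the probability of opting in rises with prior spend.
self_in, self_out = [], []
for s in spend:
    (self_in if random.random() < s / 100 else self_out).append(s)

# Randomization: a coin flip, independent of every trait.
rand_in, rand_out = [], []
for s in spend:
    (rand_in if random.random() < 0.5 else rand_out).append(s)

gap_self = statistics.mean(self_in) - statistics.mean(self_out)
gap_rand = statistics.mean(rand_in) - statistics.mean(rand_out)
print(f"pre-existing gap, self-selected groups: {gap_self:+.2f}")  # clearly nonzero
print(f"pre-existing gap, randomized groups:    {gap_rand:+.2f}")  # close to zero
```

Comparing group averages of a pre-treatment variable like this is a simple balance check: if the groups already differ before the action you are studying, any difference in outcomes cannot be credited to that action alone.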
3. Is there a confound that could explain the outcome?
Wikipedia describes a “confounding factor” as an extraneous variable in a statistical model that correlates (directly or inversely) with both the dependent variable and the independent variable, in a way that “explains away” some or all of the correlation between these two variables. Now, that’s a mouthful. The idea is easier to grasp through a classic story from Harvard University professor Gary King, director of the Institute for Quantitative Social Science. A Big Data project was attempting to use Twitter feeds and other social media posts to predict the U.S. unemployment rate by monitoring key words like “jobs,” “unemployment,” and “classifieds.” The group used sentiment analysis to see whether an increase or decrease in those key words correlated with the monthly unemployment rate. While monitoring the feeds, the researchers noticed a huge spike in the number of tweets containing one of the key words. But, as King noted, they later discovered it had nothing to do with unemployment: “What they hadn’t noticed was Steve Jobs died.”
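To make the definition concrete, here is a small sketch (synthetic data, my own illustration rather than King's example) in which a confounder Z drives both X and Y. X and Y end up strongly correlated even though neither causes the other, and the correlation vanishes once Z's contribution is removed:

```python
import random
import statistics

random.seed(1)

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))

n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]   # the confounder
x = [zi + random.gauss(0, 0.5) for zi in z]  # X responds to Z
y = [zi + random.gauss(0, 0.5) for zi in z]  # Y responds to Z; X never touches Y

print(f"corr(X, Y):               {corr(x, y):+.2f}")  # strongly positive
rx = [xi - zi for xi, zi in zip(x, z)]       # strip out Z's contribution
ry = [yi - zi for yi, zi in zip(y, z)]
print(f"corr after controlling Z: {corr(rx, ry):+.2f}")  # near zero
```

In the Twitter story, the death of Steve Jobs played the role of Z: it moved the keyword counts without moving the unemployment rate, so the apparent signal said nothing about the outcome the team cared about.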
Intuition IQ + Analytics IQ = Good Analytics
To conclude, without good judgement, big data can lead to bad decisions and bigger disasters. As we encourage a culture of analytics and fact-based decision making, we ought also to encourage intuition and insight.