Consider three examples. The first involves Billy Beane, front office executive for the Oakland Athletics and the subject of Michael Lewis’s book Moneyball, who transformed baseball using data. He didn’t do it using fancy new math, or even sophisticated statistical work. He did it by asking an important question: What kinds of players in the Major League draft typically go on to have the most successful professional careers? He used years of data to answer that question, and then drafted players with those attributes (for example, those who played in college and drew lots of walks).
Beane’s insight was not some kind of arcane statistical manipulation. It was much simpler. He recognized that he could predict who would succeed in the Major Leagues by studying who had succeeded in the Major Leagues. That’s just exploiting a pattern. In the same vein, the logic behind Moneyball can be applied to any business—there’s enormous potential to use data more powerfully without spending years studying statistics or using complicated algorithms. The essence of “big data” is much simpler: Ask an important question, find the data that might offer an answer, and figure out the pattern.
Example two comes from law enforcement. When police in Santa Cruz, California, claimed that they had “solved a crime before it happened,” it was not some futuristic, Orwellian crime-fighting strategy. It was just a pattern. The Santa Cruz police used crime data to determine when and where crimes were happening most often. Then they sent more officers to those locations. One of these spots was a parking garage where there had been a large number of break-ins. Officers spotted two suspicious-looking women lurking near a car. One of the women was wanted on an outstanding warrant; the other was carrying drugs. Police arrested them both, ostensibly before they broke into the car.
Did the police really solve a crime before it happened? The question misses the point. The Santa Cruz police used data to spot crime patterns and then sent officers to the places where they would have the most impact. That’s not mathematical genius; it’s just clever use of data.
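To make the point concrete, here is a minimal sketch of that kind of pattern-spotting: counting incidents by place and time of day and listing the busiest combinations. The incident records, the four-hour time blocks, and the function name `top_hotspots` are all invented for illustration; this is not the method the Santa Cruz police actually used.

```python
from collections import Counter

# Hypothetical incident log: (location, hour of day in 24-hour time).
INCIDENTS = [
    ("parking garage", 22), ("parking garage", 23), ("parking garage", 22),
    ("main street", 14), ("boardwalk", 22), ("parking garage", 1),
    ("main street", 15),
]

def top_hotspots(incidents, k=2, block_hours=4):
    """Return the k busiest (location, block start hour) pairs with their counts."""
    counts = Counter(
        # Round each hour down to the start of its time block, e.g. 22 -> 20.
        (place, hour // block_hours * block_hours)
        for place, hour in incidents
    )
    return counts.most_common(k)

for (place, start_hour), n in top_hotspots(INCIDENTS):
    print(f"{place}, {start_hour}:00-{start_hour + 4}:00: {n} incidents")
```

Nothing here goes beyond counting. The judgment lies in choosing which question to ask of the log and what to do with the answer.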
Lastly, when the retailer Target wanted a tool for reaching pregnant shoppers (who tend to develop strong retail loyalties during pregnancy), analysts developed a “pregnancy prediction index.” This was neither as hard nor as intrusive as it would appear. Target already had the relevant data. The retailer has a baby gift registry for expectant mothers—women who had effectively told Target not only that they were pregnant, but when they were due. Analysts studied the shopping habits of these pregnant women to discern what products they were more likely to buy than non-pregnant customers: baby wipes, unscented lotions, vitamins, and a handful of other products, some more obvious than others. The next step was just a logical leap: Women who begin buying these products are likely pregnant and can be targeted (pun intended) for pregnancy-related products and services. That is clever business, not statistical wizardry.
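The comparison described above can be sketched in a few lines: for each product, compare how often registry customers buy it with how often everyone else does. The customer records, product names, and the `purchase_lift` function below are hypothetical, and the ratio it computes is one simple way to frame the question, not a description of Target's actual index.

```python
# Hypothetical customers: (is on the baby registry, set of products bought).
CUSTOMERS = [
    (True, {"unscented lotion", "vitamins", "bread"}),
    (True, {"unscented lotion", "baby wipes", "bread"}),
    (False, {"bread", "soda"}),
    (False, {"bread", "unscented lotion"}),
    (False, {"soda"}),
    (False, {"bread", "vitamins"}),
]

def purchase_lift(customers):
    """For each product, divide the registry group's purchase rate by everyone else's."""
    registry = [items for on_registry, items in customers if on_registry]
    others = [items for on_registry, items in customers if not on_registry]
    if not registry or not others:
        raise ValueError("need at least one customer in each group")
    products = set().union(*(items for _, items in customers))
    lift = {}
    for product in products:
        registry_rate = sum(product in items for items in registry) / len(registry)
        other_rate = sum(product in items for items in others) / len(others)
        # A product bought only by registry customers has no finite ratio.
        lift[product] = registry_rate / other_rate if other_rate > 0 else float("inf")
    return lift

LIFT = purchase_lift(CUSTOMERS)
for product, ratio in sorted(LIFT.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{product}: {ratio:.2f}")
```

Products with a ratio well above 1 are the candidates for a signal. With only a handful of customers, as in this toy data, the ratios mean very little; the years of registry data are what make the pattern trustworthy.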
Of course, Target faced significant blowback when its pregnancy prediction index figured out that a high school girl was pregnant before her father did. (A series of pregnancy-related coupons from Target prompted the dad to ask some pointed questions of his daughter, according to a New York Times story on Target’s data analytics.) This is a good time to remind your team that big data requires judgment, too. Some patterns are better left private.
Most of the time, however, customers benefit enormously from well-targeted information. We want recommendations for products we are likely to enjoy, discounts for services we use, and customer service that has been refined by constant feedback. And your employees have the power to deliver these benefits, even if they consider themselves “non-math types.”
That’s because the revolution in data analysis in the last 15 years has been made possible by three things: digital data, cheap computing power, and connectivity. Fifty years ago, baseball teams had loads of statistics on player performance, but they were written in pencil in binders filed away in dank storerooms. The same was true of crime data, credit card receipts, and customer satisfaction surveys. We had the information, but there was no easy way to compile and analyze it. The patterns were there. We just couldn’t see them, at least not cheaply or easily.
Then along came the personal computer, digitization, and the internet. Suddenly, we could suck all that information out of basement storerooms and moldy ledgers and see the patterns lurking there, for free and within seconds. Once data became more valuable, we began collecting more of it: with loyalty programs, on social media, from scanner data, and so on.
The gating factor now is imaginative questions, not proper computations. Anyone can learn to ask great questions. Take this one: What kinds of people turn out to be the best at sales? That’s just the Billy Beane question again, only for sales rather than for baseball. You can use data to identify the attributes that define top performers, and then hire people with those attributes.
To be clear, this is not always easy: it’s hard to measure and quantify important skills like “listening”; the data must be collected over a long enough period to separate luck from skill; and so on. Still, the process of putting data against questions can burst myths and overcome stereotypes. This was one of Billy Beane’s first insights. His scouts were looking for talent based on “rules of thumb” that weren’t borne out by the data. For example, they were enamored of pitchers who could throw superfast, even as decades of data showed that accuracy matters more.
There are a few other caveats to keep in mind as well. First, big data tends to produce patterns, but it is not deterministic. Billy Beane is going to draft some duds; not every customer buying baby wipes and unscented lotion is pregnant.
Second, all data are inherently backward-looking. By definition, they come from the past. Because of this, data analytics will miss inflection points. Customers cannot provide meaningful feedback on a product they can’t imagine.
Third, sloppy thinking is just as dangerous with data as it is without, and maybe even more so. Yes, customers who call a complaint line report low levels of satisfaction with the service they get. That is because it’s the complaint line. The right questions to ask are: (1) Are customers more satisfied (even if still angry) at the end of the call than they were at the beginning? (2) Which employees have the most success in improving customer satisfaction? (3) What techniques do those successful employees use?
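The first two of those questions are easy to put to data once you measure satisfaction at both ends of a call. The sketch below assumes a hypothetical call log with a 1-to-5 satisfaction score recorded at the start and end of each call; the names, scores, and the `improvement_by_employee` function are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical call log: (employee, satisfaction at start, satisfaction at end).
CALLS = [
    ("Ana", 2, 4), ("Ana", 1, 3),
    ("Ben", 2, 2), ("Ben", 3, 2),
    ("Cy", 1, 4),
]

def improvement_by_employee(calls):
    """Average change in satisfaction (end minus start) for each employee."""
    changes = defaultdict(list)
    for employee, start, end in calls:
        changes[employee].append(end - start)
    return {employee: mean(deltas) for employee, deltas in changes.items()}

# Question 1: are callers, on average, happier at the end than at the start?
OVERALL_CHANGE = mean(end - start for _, start, end in CALLS)

# Question 2: which employees improve satisfaction the most?
BY_EMPLOYEE = improvement_by_employee(CALLS)
RANKING = sorted(BY_EMPLOYEE, key=BY_EMPLOYEE.get, reverse=True)

print(f"Average change per call: {OVERALL_CHANGE:+.1f}")
for employee in RANKING:
    print(f"{employee}: {BY_EMPLOYEE[employee]:+.1f}")
```

Note what the sketch ignores: the raw end-of-call scores are still low, which is exactly the misleading number the naive question would have reported. The third question, what the best employees do differently, is one the data can point toward but not answer; that takes listening to the calls.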
It’s true that basic statistics are what give power to the patterns; knowledge of basic statistics is an important skill to have. Still, I would rather teach a savvy marketing person how to do basic data analytics than try to get a statistician to think about improving the customer experience. Interesting answers are out there. People who care about those answers just need to go looking for them, maybe with a little bit of prodding. The easiest way to get all of your employees excited about using data is to demystify what is actually going on.
Author: Charles Wheelan is the author of Naked Statistics, Naked Economics, and Naked Money.