Break the bad data habit

A district property manager, while going over the budget numbers for tomorrow's big presentation, notices something odd. She asks her assistant to check it out. He digs in, finds the error (in data supplied by the Widgets Department), corrects her presentation, and sends her an email advising what he's done.

Her presentation is well-received, with that figure emerging as a linchpin in the subsequent discussion. She’s so happy that she makes an on-the-spot award to her assistant and sends him home early, advising him to enjoy a night on the town with his new girlfriend! On his way out, she observes, “You know, we should be sure to check numbers from the Widgets Department every time. I could have gotten killed in there.” He agrees to do so.

A possible disaster is averted and turned into a big win!

But is it so simple? Note that neither the executive nor her assistant explained their discovery to the Widgets Department. At a minimum, others using the erroneous data may not spot the error. There is no telling where it might turn up or who might be victimized. In the longer term, they have denied the Widgets Department a potentially important insight and the chance to get to the bottom of the problem. More subtly perhaps, they’ve taken responsibility for quality, a job they have neither the time nor the skill to do well.

There are two interesting moments in the lifetime of a piece of data: the moment it is created and the moment it is used. Quality, the degree to which the data is fit-for-use, is judged at the moment of use. If it meets the needs at that moment, it is judged of “high quality.” And conversely. The whole point of data quality management is to connect those moments in time: to ensure that the moment of creation is designed and managed to create data correctly, so everything goes well at the moment of use.

It sounds simple, and when it works, it is. But too often, vignettes such as the above are the norm. I could just as well have based the story on a military commander’s single-minded focus to “complete the mission.” Or the delivery man’s desire to get the package to the customer, no matter what.

I also could have based the story on departments, not individuals. The billing department spends time and money to correct data from operations, and customer service spends much of its time dealing with customer claims of billing errors. Finance “checks everyone’s numbers.” And so forth.

To be clear, I don’t blame the individuals or departments cited above. Quite the contrary. I admire their dedication to deliver the right facts to management, to complete the mission, or to satisfy the customer.

But soon the failure to provide feedback becomes a habit, ensuring that such vignettes repeat themselves over and over. A dangerous habit at that!

And failure to provide feedback is but the proximate cause. The deeper root issue is misplaced accountability, or failure to recognize that accountability for data is needed at all. Missing or misplaced accountability on an organization-wide scale betrays a management problem! And one that only senior management can address.

Organizations must address data quality head-on, implementing policies, creating organizational structures, and advancing cultures such that:

Data creators create data correctly, the first time, with full understanding of what that means to customers, those who use data they create.

Data customers communicate their data requirements to sources of data, and they provide feedback when data are wrong.

Virtually everyone recognizes they are at once data creators and data customers.

There is, of course, a lot more to data quality management. But let’s not make this any more complicated than it needs to be.

People and departments must continue to seek out and correct errors. They must also provide feedback and communicate requirements to their data sources, and be mindful of and create data to meet the “next person’s” requirements. For a time, these add to workloads. But steps like these quickly pay enormous dividends, in the form of data we can trust.


Author: Thomas C. Redman, Ph.D.