posted 28 Aug 2007 in Volume 11 Issue 1
EI feature: Data quality
Poor data quality can infuriate customers and staff alike. But organisations are, at last, beginning to put in place technology and processes to deal with the problem.
By Jessica Twentyman
At one time, telecoms giant BT had a serious problem with data quality. Now, the company is so confident in its ability to stamp out inaccurate, inconsistent and duplicated information across its myriad corporate systems that it offers advice and guidance to other organisations on how to clean up their act, through its consultancy arm, BT Global Services.
Getting to that point has been a ten-year journey, but the confidence that BT staff now have in the company’s corporate data is estimated to have saved it in the region of £600m over the past decade, and has vastly increased customer satisfaction rates, according to Dave Evans, senior data-management consultant at the company.
“No customer wants to do business with a company that consistently gets their name wrong,” he says. “We’ve eliminated easy mistakes that cost us money – such as an engineer turning up to the wrong address. Mistakes like that really demonstrate the importance of data quality.” A key factor in the company’s success, he adds, has been its use of specialist data-quality tools from Trillium.
Likewise, at imaging giant Xerox Europe, data-quality tools from DataFlux are being used to standardise data held by each of its 16 in-country operations, prior to loading it into a new, single-instance enterprise resource planning (ERP) system from SAP.
“On average, we’ve found that, in the best countries, about one-tenth of the data is duplicated. In the worst countries, it’s about one-third,” says Andy Bloomfield, SAP deployment manager at Xerox Europe.
“We need to consolidate that all into a single, reliable source of information – but that’s quite a big operation, so we need all the automation that technology can provide,” he says.
Like BT and Xerox, many companies find that inconsistent data is not just inconvenient – it’s a major impediment to business agility and competitiveness.
Yet many continue to suffer from this problem, in which information about the same customer or product, for example, appears in multiple systems and formats across the company, but simply doesn’t tally from system to system. This undermines reporting initiatives and can seriously impede managers’ efforts to make sound strategic decisions.
The situation certainly can’t be pinned on a lack of effort or technology investment, however.
On the contrary, organisations spent millions throughout the 1990s on ERP suites that promised to provide a central, consistent set of enterprise data. The problem is that the vast majority of businesses found they still needed to implement other, additional software products, each with its own databases, data formats and, frequently, its own version of data that appears elsewhere.
Today, more and more businesses recognise that information is a valuable corporate asset, a vital tool in the struggle to improve customer service, identify new business opportunities and shorten the sale-to-cash cycle. That recognition is driving many of them to invest in data-quality tools that enable them to establish data consistency and quality.
By definition, quality data is accurate, relevant and up-to-date. Data-quality tools help in the identification of data that does not meet these standards, enabling either its cleansing or its removal from corporate databases. By implementing these tools, goes the theory, managers can have greater confidence that the data they have will equip them to make better-informed business decisions.
Broadly speaking, data-quality tools fall into two categories: tools for data profiling and tools for data cleansing.
Data profiling enables a company to get an understanding of its data: what information is stored, where, its structure and the anomalies and inconsistencies it contains. These might include invalid data structures, incorrect values, missing values, duplicates, misplaced fields or inconsistencies between values stored in different systems. Trying to uncover such problems manually is time-consuming, prone to error and expensive.
With a clear understanding of data quality problems gained from data-profiling tools, data managers know precisely what corrective measures and cleansing rules need to be applied. Data-cleansing software applies these rules automatically and ensures that the data is standardised and corrected: duplicates are removed, structures such as field lengths and formats are standardised, and values are corrected.
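To make the distinction concrete, the sketch below shows, in simplified form, what a profiling pass and a cleansing pass might look like over a handful of customer records. It is purely illustrative: the field names, formats and rules are assumptions chosen for the example, not those of any particular vendor’s product.

```python
# Illustrative only: a toy profiling-and-cleansing pass over customer records.
# Field names, formats and rules are assumptions for this example, not the
# behaviour of any vendor's tool.
import re
from collections import Counter

records = [
    {"name": "J. Smith", "postcode": "EC1A1BB",  "phone": "020 7946 0000"},
    {"name": "J. Smith", "postcode": "EC1A 1BB", "phone": "020 7946 0000"},
    {"name": "A. Jones", "postcode": "",         "phone": "not known"},
]

# --- Profiling: report missing values, suspect values and likely duplicates ---
def profile(rows):
    report = {"missing": Counter(), "invalid_phone": 0, "duplicates": 0}
    seen = set()
    for row in rows:
        for field, value in row.items():
            if not value.strip():
                report["missing"][field] += 1
        if not re.fullmatch(r"[0-9 ]+", row["phone"]):
            report["invalid_phone"] += 1
        key = (row["name"].lower(), row["postcode"].replace(" ", "").lower())
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
    return report

# --- Cleansing: apply standardisation rules, then drop exact duplicates ---
def cleanse(rows):
    cleaned, seen = [], set()
    for row in rows:
        row = dict(row)
        pc = row["postcode"].replace(" ", "").upper()
        row["postcode"] = (pc[:-3] + " " + pc[-3:]) if len(pc) >= 5 else pc
        key = (row["name"].lower(), row["postcode"], row["phone"])
        if key not in seen:   # keep only the first copy of each duplicate
            seen.add(key)
            cleaned.append(row)
    return cleaned

print(profile(records))
print(cleanse(records))
```

The order matters: profiling runs first to show what is actually wrong, and only then are cleansing rules written and applied, so the rules address the problems that really exist in the data.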
The range of vendors offering these products, meanwhile, grows ever broader. In the most recent Gartner Magic Quadrant for Data Quality, some 15 suppliers are named, including Business Objects, Datactics, DataFlux, DataLever, DataMentors, Datanomic, Fuzzy Informatik, Pitney Bowes Group 1 Software, Human Inference, IBM, Informatica, Netrics, Trillium and Uniserv.
A company-wide effort
Regardless of what product they ultimately select, most veterans of data-quality projects agree on one point at least: data quality involves a serious company-wide effort, not just by the IT team, but also by a team of information-management professionals and knowledge workers scattered around the organisation.
However, many organisations fall prey to a particularly common mistake, says Ted Friedman, an analyst with IT market research company Gartner. “They view data as a technology-oriented problem and rely on the IT department to ensure the security, availability and quality of data, treating data as a ‘necessary evil’, rather than as an important corporate asset,” he says.
In fact, he continues, data quality is a business issue, “in terms of its impact and the optimal approaches for addressing it”.
As a result, it’s the business, not the IT department, that should define what is ‘good enough’ quality, but most organisations require a significant degree of culture change to reach the point where data-quality improvements are driven and supported by the business, he says.
At the same time, the business needs to realise that optimal improvements to data quality cannot be achieved through technology alone, says Friedman. “A combination of organisational and process improvements will be required to deliver the maximum positive impact and to ensure long-term maintenance of acceptable data-quality levels,” he says. For that reason, companies embarking on a data-quality exercise should assign business analysts and business-unit managers to data stewardship roles, in order to place accountability for data-quality issues emphatically with the business, rather than with IT.
“Data quality must be supported by an organisational structure that encompasses people from virtually every line of business, including sales, marketing, service, production, finance, human resources and IT. This organisation is usually assembled virtually so that the data-quality sponsor – the head of the initiative – has access to people throughout the organisation,” says Friedman.
“Members of the virtual data-quality team should be considered subject-matter experts of the department or area that they physically belong to,” he adds. For example, a marketing specialist from the company’s marketing department may act as the data steward in the data-quality organisation, with the responsibility to ensure that marketing-relevant information adheres to the corporate data-quality standards. This might include information attributes, such as completeness, correctness, consistency, integrity and non-duplication.
Technology, however, goes a long way in helping business users clean up data at speed. At BT, for example, data profiling had historically been a manual effort that was time-consuming and people-intensive, until the company implemented Trillium’s TS Discovery data-profiling tool. This not only enabled BT to place control in the hands of business users best positioned to identify links between data anomalies and process inefficiencies, but also increased data-analyst efficiency dramatically.
The ability within TS Discovery to view profile results together with actual data rows, for example, helped business users see more clearly the impact of data issues, while the product also helped data-quality specialists to quickly prototype cleansing rules.
BT’s success in tackling its data-quality issues, moreover, has led to more ambitious initiatives at the company. In recent years, it has moved towards master data management (MDM), in which customers’ names and addresses are held in a single ‘master copy’ database called NAD – the Name and Address Database.
The data NAD holds is standardised and cleansed in real time, enabling the repository to act as a single, authoritative source of accurate, up-to-date customer information that populates hundreds of applications around BT, explains Evans. Data profiling and cleansing, he adds, was just the start of a far more visionary journey – “and a highly worthwhile one at that”.
Jessica Twentyman can be contacted by e-mailing firstname.lastname@example.org
Case study: Irish Life & Permanent
Mergers and acquisitions frequently create data quality hassles, especially where two customer databases need to be consolidated.
That was the case for financial services company Irish Life & Permanent, formed from the 1999 merger of insurance company Irish Life and the Irish Permanent bank.
“While the two businesses – life insurer and bank – continue to operate separately and have their own brands and offices, we nevertheless want to cross-sell to our combined customer base, and there’s a lot of crossover between businesses and products,” says Noel Garry, executive manager of IT strategy and planning at Irish Life & Permanent.
“If we have a Noel Garry on our life database and a Noel Garry on our bank database, it’s vital we know that we’re dealing with the same person,” he says.
As a result, the company has embarked on a massive clean-up operation, in preparation for implementing IBM’s WebSphere Customer Center product for master data management (MDM), which will enable it to establish a single record of its entire customer base.
“At first, we thought we might simply load data from the ‘bank’ file into the ‘life’ file, but after a bit of exploration, we got talking to Gartner about MDM and we realised that this would provide us with a more long-term answer to the challenge of keeping our data consistent,” says Garry.
A massive data clean-up is now underway, using IBM Information Server to bring together matching records and consolidate them. When agreed thresholds for matches are not reached, the system marks the records as ‘suspect’ and flags them for human examination.
Using that approach, the group has already identified 117,000 exact duplicates – where there are two or more records of the same customer – within the Irish Life database alone.
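In outline, that threshold-based matching works along the lines of the sketch below: record pairs scoring above an upper cut-off are treated as the same customer, while pairs falling into a grey zone between the two thresholds are flagged as ‘suspect’ for a person to examine. The similarity measure and threshold values here are assumptions chosen for illustration, not the actual matching logic of IBM Information Server.

```python
# Illustrative sketch of threshold-based record matching with a 'suspect' queue.
# Thresholds and the similarity measure are assumptions for this example.
from difflib import SequenceMatcher
from itertools import combinations

MATCH_THRESHOLD = 0.92    # assumed: treat as the same customer above this score
SUSPECT_THRESHOLD = 0.75  # assumed: flag for human review above this score

customers = [
    "Noel Garry, 14 Main Street, Dublin",
    "N. Garry, 14 Main St, Dublin",
    "Nora Garrett, 2 Harbour Road, Cork",
]

def similarity(a, b):
    """Normalised string similarity between two whole records."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

matches, suspects = [], []
for left, right in combinations(customers, 2):
    score = similarity(left, right)
    if score >= MATCH_THRESHOLD:
        matches.append((left, right, round(score, 2)))
    elif score >= SUSPECT_THRESHOLD:
        suspects.append((left, right, round(score, 2)))

print("Automatic matches:", matches)
print("Flagged for review:", suspects)
```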
Now that those duplicates have been eliminated and the rest of the data has been cleaned up, says Garry, the group can start to load that database into WebSphere Customer Center.
The data held in the bank database, meanwhile, will also be cleaned and loaded into the MDM system towards the end of 2007.
“Once that’s complete, every time a new customer is entered into any of our systems, that system will reference WebSphere Customer Center and use the hub to standardise the data captured,” says Garry. “And if a customer calls into the bank and tells us their mobile-phone number has changed, the new number is automatically distributed by the hub to all the systems that hold information on that customer.”
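The hub pattern Garry describes can be reduced to a very small sketch: downstream systems subscribe to the master-data hub, and a change accepted through any one channel is written to the master record and pushed out to every subscriber. The class and method names below are invented for illustration and are not WebSphere Customer Center’s API.

```python
# Illustrative sketch of a master-data hub distributing updates to subscribers.
# Names are assumptions for the example, not WebSphere Customer Center's API.
class MasterDataHub:
    def __init__(self):
        self.master = {}        # customer_id -> master record
        self.subscribers = []   # downstream systems kept in sync

    def subscribe(self, system):
        self.subscribers.append(system)

    def update(self, customer_id, field, value):
        record = self.master.setdefault(customer_id, {})
        record[field] = value
        for system in self.subscribers:   # distribute to every system
            system.apply_update(customer_id, field, value)

class DownstreamSystem:
    def __init__(self, name):
        self.name = name
        self.records = {}

    def apply_update(self, customer_id, field, value):
        self.records.setdefault(customer_id, {})[field] = value
        print(f"{self.name}: customer {customer_id} {field} -> {value}")

hub = MasterDataHub()
for name in ("bank CRM", "life policy admin", "marketing"):
    hub.subscribe(DownstreamSystem(name))

# A customer tells the bank their mobile number has changed; the hub
# propagates the new value to every subscribed system.
hub.update("C1001", "mobile", "+353 87 000 0000")
```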
MDM, he emphasises, is just a starting point for Irish Life & Permanent. “It will be a springboard for all types of new initiatives, which is why we see this very much as an enablement project for all types of improvements to the services we can offer our customers,” he says.
The rising profile of data-quality tools
A number of factors have promoted data quality to the top of the boardroom agenda in recent years, according to analysts at IT market research company Forrester Research:
Regulatory compliance
Compliance requirements, such as the Sarbanes-Oxley Act, have refocused attention on information quality and shifted its importance into the corporate boardroom. Increasingly, missed deadlines in closing corporate accounting books and statutory reporting have been blamed on data defects and quality issues.
Poor data quality in CRM systems
According to Forrester analyst Lou Agosta, information quality is the weak underbelly of customer-relationship management (CRM) implementations, and many fail to deliver the accurate, 360-degree view of the customer that the vendors of such systems originally promised. Information-quality tools need to identify individual customers across multiple datasets and help eliminate duplications.
Bad data creates costly operational inefficiencies
Duplicated customer or product data creates redundant information that impacts all downstream processes that use it. Backups, system interfaces and repeated verification of the same data increase the cost of daily storage management processes. Productivity is also hit as tasks are repeated.
Mergers, acquisitions and reorganisation require data integration
Mergers and acquisitions of companies create critical compatibility issues between different information technology systems. If data is not carefully inventoried and evaluated, there is the risk of dysfunctional islands of information and data silos being created.
Loss of trust
A lack of data and information quality across systems reduces the value of all systems to employees as it becomes difficult for them to judge which one is accurate.
Source: Forrester Research
Table One: The impact of poor-quality data
Sales and marketing
Symptoms: Low customer satisfaction; many address change requests; no trust/agreement in reporting.
Costs: Excessive mailing expense; cost of lost sales.
Finance
Symptoms: Budgets take forever to get ‘right’; big budget/actual discrepancies; no trust/agreement in reporting.
Costs: Fines and possible jail time under the latest regulations.
Operations
Symptoms: ‘Out of stock’ situations; no trust/agreement in reporting.
Costs: Reduced top-line revenue; excessive shipping expenses.
IT
Symptoms: Large IT projects fail; low use of applications; no trust/agreement in reporting.
Costs: IT investments wasted; productivity remains low.
Source: Forrester Research