Feature
posted 26 Apr 2007 in Volume 10 Issue 7
Workshop: Master-data management
Managing master data, part one
Master-data management, the practice of managing key sets of data centrally instead of in application silos, is becoming a key element of information management. But where to begin?
By Mike Fleckenstein
Master-data management (MDM) is now a common topic in IT communities and it is catching on in business circles, too. With good reason: data has become a very valuable asset, but one that has always proved difficult to manage.
Major organisations are increasingly recognising this – especially the value of customer data. These organisations know that focusing on key metrics will help them improve the bottom line and that the level of customer care – a very key metric, indeed – can be increased by understanding and catering to the behaviour of their customers.
But to do that, they need to get that customer data right.
As the volume of data under management continues to increase, ordinary companies are plagued with the problem of managing inconsistent sets of data – is the customer Andrew Smith or Andy Smith? Or are they two separate customers?
Such data questions are challenging enough to answer in an unchanging business climate. But in today’s rapidly changing landscape, companies grow, acquire rivals, merge and de-merge at a startling rate. The corresponding IT systems, likewise, need to be scaled up, and merged and de-merged at a similar pace.
Furthermore, there is mounting pressure to report key metrics at the corporate level accurately and consistently – while all this frenetic activity is going on in the background.
Similarly, how customer care is applied and its effectiveness can influence a company’s bottom line. Hence, the need for ‘master data’ – a single data set of, for example, customers, that can be referenced by all systems that need it.
The art of MDM
While there are a number of alternative approaches to MDM, the central definition remains consistent. Master data can be defined as the ‘golden copy’ or ‘single version of the truth’ of information required to create and maintain an enterprise-wide ‘system of record’ for core business entities spanning all organisational business functions. MDM can be broadly defined as the process to support ongoing integration of master data across the enterprise. In terms of data, then, where does master data fit in?
Roughly speaking there are three types of data. They are:
Transaction data
These are measurements taken at a particular point in time. For example, dollars earned or units sold. Typically these data-snapshots are stored and lend themselves to trend analysis;
Metadata
This is information about a particular data set which may describe, for example, how, when and by whom it was received, created, accessed, and/or modified and how it is formatted;
Reference data
Reference data uniquely identifies an entity. ‘Customer’, ‘branch’ and ‘product’ are good examples. This type of data is often inconsistently and redundantly stored within an organisation.
The art of MDM pertains to the method and structure used to house a ‘single version of the truth’ for key reference data within an organisation.
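The three types of data described above can be pictured as three small record types. This is only an illustrative sketch: the class names, field names and sample values are assumptions made for the example, not part of any MDM product.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class ReferenceRecord:
    """Reference data: uniquely identifies an entity such as a customer."""
    entity_type: str   # e.g. "customer", "branch", "product"
    entity_id: str     # hypothetical enterprise-wide key
    name: str


@dataclass(frozen=True)
class Transaction:
    """Transaction data: a measurement taken at a point in time."""
    customer_id: str   # points at a ReferenceRecord
    sale_date: date
    units_sold: int
    dollars_earned: float


@dataclass(frozen=True)
class Metadata:
    """Metadata: information about a data set, not the data itself."""
    data_set: str
    created_by: str
    created_at: datetime
    data_format: str


# Made-up sample values for illustration only.
customer = ReferenceRecord("customer", "C-001", "Andrew Smith")
sales = [
    Transaction("C-001", date(2007, 1, 15), 3, 120.0),
    Transaction("C-001", date(2007, 2, 15), 5, 200.0),
]
sales_info = Metadata("sales", "billing system", datetime(2007, 3, 1, 9, 0), "CSV")

# Trend analysis works on the transactions; the reference record says whose they are.
total_units = sum(t.units_sold for t in sales if t.customer_id == customer.entity_id)
```

If the same customer were stored under different keys in different systems, the last line would undercount. That is the inconsistency a master copy of the reference data is meant to remove.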
To take it one step further, master data can be distinguished from static or reference data. Reference data can be viewed as items that don’t change over time or within the organisation: US states, for example. These will be defined the same way in all systems for the foreseeable future.
Master data, on the other hand, can change relatively rapidly over time or within a company. Products are a very good example of this – in most industries, they change in some way at least once a year. Likewise, many customers will change their addresses within a year and may even use interchangeable variations on their name in correspondence – Andy or Andrew, for example.
Master-data management must also be distinguished from a data warehouse. Both contain cleansed, standardised data, but the difference is that in MDM, the data is pushed back to the source systems. That is to say, it is in everyday use. A data warehouse typically cannot assess data relationships across source systems dynamically.
Data stored in the data warehouse may also be at a different level of granularity than what is required in the source system. It is, though, possible and reasonable for a data warehouse to be connected to the master-data layer. In a nutshell, having a master-data layer has the following additional benefits:
- Master data is dynamically fed back to the source system and available for analysis and reporting;
- Source systems can augment standardised enterprise master-data with local data, needed for their reporting;
- It enables a ‘plug and play’ approach in which source systems can be replaced individually as they age and their replacements integrated with the master-data layer.
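As a rough sketch of the second benefit, a source system might combine the standardised master record with its own local fields as shown below. The dictionaries, keys and the `local_view` function are hypothetical, and the rule that master fields win on a clash is an assumption chosen for the example.

```python
# Hypothetical master-data layer: one standardised record per customer key.
master_customers = {
    "C-001": {"name": "Andrew Smith", "postcode": "20001"},
}

# Local data kept by one source system (here, a made-up marketing application).
marketing_local = {
    "C-001": {"preferred_channel": "email"},
}


def local_view(customer_id, master, local):
    """Combine the master record with a source system's local fields.

    Master fields win on any clash, so a local system can add attributes
    but cannot override the enterprise definition.
    """
    combined = dict(local.get(customer_id, {}))
    combined.update(master[customer_id])
    return combined


view = local_view("C-001", master_customers, marketing_local)
```

Because the local system only adds fields, it could in principle be replaced without touching the master layer, which is the 'plug and play' idea in the third point.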
MDM defined
The speed of change in business not only drives the need for master data, but also the need for master-data management.
Successful companies are typically flexible and adaptable. They are able to forecast more effectively, are therefore better able to make operational decisions and more readily achieve compliance with legislative mandates. It should be clear by now to anyone working with data that data accuracy helps greatly in these areas.
The force behind greater data accuracy lies in the coordination of how data gets into systems in the first place. A key component of MDM is data governance. This determines what data should be defined and how, how exceptions are handled and how data will change. From the CEO downwards, each organisational layer has a reason to manage data in a certain way. For example, finance will want to see what has been paid for, operations will need to see what has been sold and sales will want to highlight what is in the pipeline.
Effectively governing data requires a partnership between the business community and IT. Business people have a much deeper understanding of procedures and business rules. They are the people who must identify master data. So, just as a specific person or group of people in IT are responsible for implementing master data, specific individuals in the business must be given the responsibility (ie: ownership) for stewarding certain items of data. ‘Data stewards’ function as the point of accountability for a key data-subject area and work together with IT on a regular basis to identify, standardise and manage changes of master data.
Who acts as a data steward is a function of jobs or roles within the company and of company size. Roles required for MDM include: data requirements, data administration, data security, data quality, metadata management and subject-matter expertise.
Typically, a company has positions for data requirements (analyst), data administration (DBA) and security (network administrator, system administrator and database administrator). Functions for data quality, subject-matter expertise and metadata management may also be assigned to individual positions, but are frequently blended into other jobs. An operational business representative, data analyst and database administrator may share the responsibility for (local) data quality, for example.
For a smaller company, MDM is less a function of having distinct positions for the above roles and more a function of clearly assigning ownership for these tasks to individuals. Obviously, an organisation of just 50 staff cannot afford an MDM team. However, individuals can be assigned responsibility to steward data. The job of data steward can even be launched from within IT to show its value and eventually transitioned to the business side.
For a larger organisation, having a formal MDM group is realistic and important. It should be made up of individuals from key lines of business, as well as IT. It should meet regularly with corporate executives and include additional subject-matter experts and business people when necessary.
However, even in a larger company it is important to understand first how existing positions fit into the roles required for MDM. As an example of how important MDM is becoming, consider the position of ‘chief knowledge officer’ that has been created in some organisations.
This position is typically responsible for defining all the rules and regulations around data.
Where to start
MDM will never apply to all corporate data. Nor should it. Each division of an organisation will work with a certain set of data that is very specific to its own operations and cannot effectively be shared. The trick is to determine what data should be shared. The other thing to keep in mind is that the MDM model is an evolving one. It is best to begin with a sampling of data, demonstrate results and then to expand the MDM effort.
As such, two groups of MDM candidates come to mind. The first centres around key metrics and the second around customer-data integration (CDI).
Key metrics-related master data candidates are relatively quickly determined by analysing existing, high-level reports and then backing into the processes used to create them.
Bottlenecks in these processes, such as having to manually merge data in a spreadsheet to reconcile data, will become readily apparent.
Once a clear source for master data is defined, this type of manual data-integration is reduced. Key metrics-related data helps executives analyse and project performance, and is therefore a good place to start.
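To make the spreadsheet bottleneck concrete, the sketch below totals sales from two systems that use different customer codes by mapping each code to one master key. The system names, codes and figures are invented for illustration, and the cross-reference table is an assumed design, not a prescribed one.

```python
from collections import defaultdict

# Hypothetical cross-reference: (source system, local code) -> master customer key.
cross_reference = {
    ("billing", "B-17"): "C-001",
    ("crm", "AS-9"): "C-001",
    ("crm", "JD-2"): "C-002",
}

# Made-up figures as each system would report them.
billing_sales = [("B-17", 120.0)]
crm_sales = [("AS-9", 80.0), ("JD-2", 50.0)]


def sales_by_master_customer(feeds, xref):
    """Sum sales per master key instead of reconciling by hand in a spreadsheet."""
    totals = defaultdict(float)
    for system, rows in feeds.items():
        for local_code, amount in rows:
            totals[xref[(system, local_code)]] += amount
    return dict(totals)


totals = sales_by_master_customer(
    {"billing": billing_sales, "crm": crm_sales}, cross_reference
)
```

A code missing from the cross-reference raises a `KeyError` here, which is deliberate in this sketch: an unmapped customer is an exception for a data steward to resolve rather than something to guess at.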
CDI works well as a starting point if the organisation has multiple applications where customers have been, and are being, independently maintained. Since it is costly (and embarrassing) for companies to have incorrect and duplicate customer data, executives are equally eager to encourage customer-data management projects. CDI as a starting point is particularly relevant in cases where a business is pushing (or would like to push) customer-data management to the customers themselves, online, since the integrity of customer data will become highly visible. Furthermore, the customers can help to manage and maintain the integrity of the data, thereby cutting costs.
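The Andrew Smith versus Andy Smith question raised earlier gives a feel for what duplicate detection in a CDI effort involves. The following is a deliberately simple sketch using Python's standard `difflib`; the nickname table, the 0.9 threshold and the rule that postcodes must match are all assumptions for illustration, and a real project would more likely rely on a dedicated data-cleansing tool.

```python
from difflib import SequenceMatcher

# Tiny illustrative nickname table; a real list would be far larger.
NICKNAMES = {"andy": "andrew", "mike": "michael", "bill": "william"}


def normalise(name):
    """Lower-case the name and expand a known nickname in the first name."""
    parts = name.lower().split()
    if parts:
        parts[0] = NICKNAMES.get(parts[0], parts[0])
    return " ".join(parts)


def likely_same_customer(a, b, threshold=0.9):
    """Flag two records as probable duplicates for a data steward to review."""
    same_postcode = a["postcode"] == b["postcode"]
    similarity = SequenceMatcher(
        None, normalise(a["name"]), normalise(b["name"])
    ).ratio()
    return same_postcode and similarity >= threshold


# Made-up records for illustration.
rec_a = {"name": "Andrew Smith", "postcode": "20001"}
rec_b = {"name": "Andy Smith", "postcode": "20001"}
rec_c = {"name": "Andrea Smythe", "postcode": "90210"}
```

The function only flags candidates. Whether two flagged records are really one customer remains a governance decision, which is why the result is described as something for review rather than an automatic merge.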
The best approach is to begin small. While analysis of master-data candidates should span the enterprise (or at least a good part of it) to ensure that master-data candidates will remain consistently defined, implementation can occur in phases limited both vertically (in terms of the number of data items) and horizontally (beginning with just two systems).
It is also important to understand that MDM is more about the management of data than about the technology used. That said, there are technology alternatives that must be considered prior to implementing the MDM layer.
Questions of how to design an MDM solution can be addressed at many levels. Two key questions to consider include: ‘Should the implementation be custom-built or off-the-shelf?’ and, ‘from among the many choices of off-the-shelf tools, data-quality tools, data-cleansing tools, data-integration tools and third-party content vendors, which ones make most sense?’
While MDM architecture is beyond the scope of this article, it is easy to see that, particularly with the CDI approach, a data-cleansing tool or third-party interface will be essential to make sure that the master data is as accurate as possible before it is rolled out organisation-wide.
Custom or off-the-shelf?
Building a custom solution enables the organisation to tailor the application very specifically to end-user needs and to quickly incorporate new requirements. It is also less expensive, at least initially. A further advantage of a custom approach is an interface that cannot be replicated by an off-the-shelf (OTS) package.
For example, one CDI implementation at a large government agency displays a user-friendly ‘wizard’ to walk customers through a series of steps the first time they log on. This wizard enables them to identify themselves, merge duplicate records and to rectify their association to relevant cases, resulting in their user profile. The system then enables the user to manage their profile each successive time they log on. But there are trade-offs. As the master-data pool grows both horizontally (more applications) and vertically (more data) maintenance becomes more labour intensive.
Flexibility is also reduced as the application grows.
The biggest impediment to buying off-the-shelf is the up-front cost. The MDM software alone can cost hundreds of thousands of dollars to purchase. But the purchase demonstrates that the company is taking MDM seriously. MDM software packages are still maturing, but they are getting better and vendors are putting time and effort into these products. These packages offer a rules-based engine, data model flexibility, a workflow component, time variance and an application programming interface.
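To give a sense of what a rules-based engine does, here is a minimal sketch in which each rule is a description paired with a check. It does not reflect any vendor's product or API; the rules themselves (including the five-digit postcode, which assumes US addresses) are invented for the example.

```python
def _postcode_ok(record):
    """Assumed rule: a postcode is exactly five digits (US-style)."""
    postcode = record.get("postcode", "")
    return postcode.isdigit() and len(postcode) == 5


# Each rule is a (description, check) pair; check returns True when the record passes.
RULES = [
    ("name is present", lambda r: bool(r.get("name", "").strip())),
    ("postcode is five digits", _postcode_ok),
    ("status is a known value", lambda r: r.get("status") in {"active", "inactive"}),
]


def validate(record, rules=RULES):
    """Return the descriptions of every rule the record fails."""
    return [description for description, check in rules if not check(record)]


# Made-up records for illustration.
good = {"name": "Andrew Smith", "postcode": "20001", "status": "active"}
bad = {"name": " ", "postcode": "2000", "status": "pending"}
```

Keeping the rules in a list rather than scattered through application code is the point of the design: the data-governance team can review and change them in one place.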
Either way MDM is not cheap. The key to minimising cost and maximising benefit is organising your approach towards MDM at the start. It helps to define MDM candidates before deciding on technology by rigorously decomposing processes and identifying bottlenecks first.
The bigger picture
MDM is closely linked to the service-oriented architecture (SOA) concept. The ability for an enterprise to link service functions to each other forms the basis of SOA (see Splitting Headache, this issue). The trend for MDM (and SOA) extends beyond the individual firm.
Firms with separate computer systems must increasingly interact quickly. In the insurance industry, for example, there is interaction between an insurer, agent/broker, re-insurer and regulators. But much time is spent formatting data to the satisfaction of an external party. It is not unrealistic to assume that, as the speed of business continues to increase, data standards across organisations or even industries will become more common.
One concept that has already been developed along this line is straight-through processing (STP). This enables the whole trade process for capital markets and payment transactions to be administered electronically without the need for re-keying or manual intervention, which can be subject to legal and regulatory restrictions. While not industry-wide, the concept of STP is currently being applied between individual stock exchanges. The concept has also been transferred into other asset classes including energy trading.
The business case
Successful implementation requires a solid business case. As with any business case this should cover revenue improvement, new revenue generation, reduced costs, increased productivity, reduced risk, improved quality of service and increased employee and non-employee (ie: customer) satisfaction.
‘Hard’ benefits are financial in nature and can be quickly pinpointed by finding bottlenecks in processes involved in compiling master data. ‘Soft’ benefits, such as greater employee or customer satisfaction, typically require buy-in from users and managers alike based on previous successes. Such successes, like an example of how greater automation has helped in other ways, can bolster the business case.
No doubt executives are frustrated by some lack of master data. Whether it is customer, product or some other key reporting item, the business case can use the following points to support its proposal to management:
- Master data provides a more rounded view of an organisation and its operations;
- MDM results in consistent reporting and better forecasting;
- MDM reduces the clutter created when IT systems with overlapping sets of functionality proliferate at various levels of an enterprise;
- MDM makes an enterprise more readily compliant with legislative mandates;
- MDM supports the proliferation of the service-oriented architecture, in which ‘loosely-coupled applications’ are forced to share and report on the same data.
Another ‘soft’ benefit MDM provides is the embodiment of business rules. Convincing arguments can highlight that:
- Human experience is never exact. While experience and a ‘gut feeling’ are good, consistent data based on a rules-engine is better;
- There is a relatively high risk that key subject-matter experts will leave the firm, taking their knowledge with them. This throws business knowledge out of the window or worse, gives business away to the competition;
- Employers can be ‘held hostage’ by developers and subject-matter experts alike whose jobs it is to integrate data in a manual way. This can be costly, inaccurate and inefficient;
- The cost of manually assembling data (or programmatically integrating inconsistent data across multiple silos) for reporting purposes likely outweighs the cost of governing strategic data.
Mike Fleckenstein is principal analyst, business intelligence and data warehousing practice, at Project Performance Corporation (www.ppc.com), and leader of PPC’s insurance practice. He has more than 20 years’ experience developing and deploying data management solutions in both the public and private sectors, for clients around the world. Prior to joining PPC, Fleckenstein served as application manager at Medmarc Insurance and ran his own IT consulting firm, Windsor Systems Inc, specialising in IT and data solutions. He can be contacted by e-mailing mfleckenstein@ppc.com.
Key steps towards MDM
MDM is not a solution that purports to cleanse and standardise all data. As mentioned above, it focuses on a system of record for core business entities. There will always be the need for departments to analyse detailed subsets of data in their own way. However, MDM presents a clear way to reduce the clutter. Below are some approaches that can be applied to make master-data management successful:
- Create a data-governance team and rules for governing data. Start small, but take the time to govern specific types of data, for example, product data. Ensure that the evolution of a data item allows consistent historical reporting. The team should be sponsored by senior management and include participants from both the business and IT side. Having the chief financial officer or another high-level person from your finance department lead this team, or at least play an integral part in it, will yield the best results;
- Where possible, update data in a single place. That does not mean marketing and underwriting (in insurance) cannot have their own applications. Rather, it means that updates to data are made in one place (the ‘golden copy’) and then proliferated to other applications. If bi-directional updates are absolutely required, govern them carefully;
- Put customer data in the hands of the users;
- Look at your key success measures. These must be standardised in terms of how they are calculated and, if they come from multiple systems, how they are combined;
- If you have a data warehouse, look there to find standardised definitions for key business entities. Now push them back to your source systems to perpetuate the ‘golden copy’;
- The road to MDM is cyclical. Begin with a few master-data entities and as many applications as possible. This is a better approach than identifying lots of entities but just a few applications if you want to perpetuate your data company-wide.
In each of the above cases, specific data is stored in an MDM environment to yield consistent results. Furthermore, MDM encompasses historical snapshots of data. This allows consistent analysis and more accurate projection of data into the future.
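The ideas of updating in a single place, proliferating changes to other applications and keeping historical snapshots can be sketched together as follows. The `GoldenCopy` class, its method names and the sample customer are hypothetical; this is one possible shape under those assumptions, not a reference design.

```python
from datetime import date


class GoldenCopy:
    """Hypothetical single point of update that keeps every past version."""

    def __init__(self):
        self._history = {}      # key -> list of (effective_date, record)
        self._subscribers = []  # callables standing in for source systems

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def update(self, key, record, effective):
        """Record a new version, then push it out to every subscribed system."""
        versions = self._history.setdefault(key, [])
        versions.append((effective, dict(record)))
        versions.sort(key=lambda version: version[0])
        for notify in self._subscribers:
            notify(key, dict(record))

    def as_of(self, key, when):
        """Return the version in force on a given date, or None if none existed."""
        current = None
        for effective, record in self._history.get(key, []):
            if effective <= when:
                current = record
        return current


marketing_app = {}  # stands in for one downstream application


def marketing_receive(key, record):
    marketing_app[key] = record


golden = GoldenCopy()
golden.subscribe(marketing_receive)

# Made-up address change for illustration.
golden.update("C-001", {"name": "Andrew Smith", "city": "Boston"}, date(2006, 1, 1))
golden.update("C-001", {"name": "Andrew Smith", "city": "Denver"}, date(2007, 3, 1))
```

Here the marketing application always holds the latest version pushed to it, while `golden.as_of("C-001", date(2006, 6, 30))` still returns the Boston record, so a report for mid-2006 can be reproduced consistently.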