Master data management
In business, master data management (MDM) comprises the processes, governance, policies, standards and tools that consistently define and manage the critical data of an organization to provide a single point of reference.
The data that is mastered may include:
- reference data - the business objects for transactions, and the dimensions for analysis
- transactional data - supports applications
- analytical data - supports decision making 
In computing, an MDM tool can support master data management by removing duplicates, standardizing data (mass maintenance), and applying rules that keep incorrect data out of the system, so as to create an authoritative source of master data. Master data are the products, accounts and parties for which business transactions are completed. The root problem stems from business-unit and product-line segmentation: the same customer is serviced by different product lines, each of which enters redundant data about the customer (that is, the party in the role of customer) and the account in order to process a transaction. This redundancy of party and account data is compounded across the front-to-back-office life cycle, where an authoritative single source for party, account and product data is needed but the data is often re-entered or augmented yet again.
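The three functions named above (standardizing, rule-based rejection, and duplicate removal) can be illustrated with a minimal sketch. The record layout, the email-based duplicate key and the validation rule are assumptions chosen for illustration, not features of any particular MDM product.

```python
import re

def standardize(record):
    """Collapse whitespace and normalize case so equal values compare equal."""
    name = re.sub(r"\s+", " ", record["name"]).strip().title()
    email = record["email"].strip().lower()
    return {"name": name, "email": email}

def is_valid(record):
    """Hypothetical entry rule: a non-empty name and a plausible email."""
    pattern = r"[^@\s]+@[^@\s]+\.[^@\s]+"
    return bool(record["name"]) and re.fullmatch(pattern, record["email"]) is not None

def build_master(records):
    """Return (master, rejected): one record per email, plus rule violations."""
    master, rejected = {}, []
    for raw in records:
        rec = standardize(raw)
        if not is_valid(rec):
            rejected.append(raw)  # kept aside for a data steward to review
            continue
        # Assumption: email identifies a party; the first occurrence wins.
        master.setdefault(rec["email"], rec)
    return list(master.values()), rejected

incoming = [
    {"name": "  ada   lovelace", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Charles Babbage", "email": "not-an-email"},
]
master, rejected = build_master(incoming)
```

Here the first two entries collapse into a single master record after standardization, and the third is rejected by the entry rule rather than stored.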
MDM has the objective of providing processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing such data throughout an organization to ensure consistency and control in the ongoing maintenance and application use of this information.
The term recalls the concept of a master file from an earlier computing era.
At a basic level, MDM seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations. A common example of poor MDM is the scenario of a bank at which a customer has taken out a mortgage and the bank begins to send mortgage solicitations to that customer, ignoring the fact that the person already has a mortgage account relationship with the bank. This happens because the customer information used by the marketing section within the bank lacks integration with the customer information used by the customer services section of the bank. Thus the two groups remain unaware that an existing customer is also considered a sales lead. The process of record linkage is used to associate different records that correspond to the same entity, in this case the same person.
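A minimal sketch of record linkage for the bank scenario follows. It assumes hypothetical marketing and customer-service records that share a postcode field, and it uses string similarity from Python's standard `difflib` module as one simple matching technique among many; the 0.85 threshold is an arbitrary illustrative value.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(sorted(cleaned.split()))

def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def link_records(leads, customers, threshold=0.85):
    """Pair each lead with its best-matching customer in the same postcode."""
    links = []
    for lead in leads:
        best, best_score = None, 0.0
        for cust in customers:
            if lead["postcode"] != cust["postcode"]:
                continue  # "blocking": only compare plausible candidates
            score = similarity(lead["name"], cust["name"])
            if score > best_score:
                best, best_score = cust, score
        if best is not None and best_score >= threshold:
            links.append((lead["lead_id"], best["customer_id"]))
    return links

# Hypothetical extracts from the two departments' systems.
marketing_leads = [
    {"lead_id": "L1", "name": "Jon A. Smith", "postcode": "90210"},
    {"lead_id": "L2", "name": "Maria Garcia", "postcode": "10001"},
]
mortgage_customers = [
    {"customer_id": "C7", "name": "Smith, John A", "postcode": "90210"},
    {"customer_id": "C9", "name": "Wei Chen", "postcode": "10001"},
]
links = link_records(marketing_leads, mortgage_customers)
```

With these inputs, lead L1 is linked to existing customer C7 despite the different spelling and word order, so the marketing section could suppress the mortgage solicitation; lead L2 has no sufficiently similar customer and remains a genuine lead.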
Other problems include data quality, consistent classification and identification of data, and data reconciliation. Managing master data across disparate systems requires data transformations: data extracted from each source system is transformed and loaded into the master data management hub. To keep the sources synchronized, managed master data extracted from the hub is in turn transformed and loaded back into each source system whenever the master data is updated. As with other extract, transform, load (ETL) data movement, these processes are expensive and inefficient to develop and maintain, which greatly reduces the return on investment of the master data management product.
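The round trip between a source system and the hub can be sketched as a pair of transformations. The field names (`cust_nm`, `cust_tel`) and the hub schema below are hypothetical; the point is that every source system needs its own pair of mappings, which is where much of the development and maintenance cost arises.

```python
# Hypothetical mapping from one source (a CRM) to the hub schema.
CRM_TO_HUB = {"cust_nm": "name", "cust_tel": "phone"}
HUB_TO_CRM = {hub: src for src, hub in CRM_TO_HUB.items()}

def rename(record, mapping):
    """Keep only mapped fields and rename them to the target schema."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

def to_hub(crm_record):
    """Extract and transform a CRM record for loading into the hub."""
    hub = rename(crm_record, CRM_TO_HUB)
    # Assumed hub convention: phone numbers are stored as digits only.
    hub["phone"] = "".join(ch for ch in hub["phone"] if ch.isdigit())
    return hub

def to_crm(hub_record):
    """Transform a managed hub record back into the CRM schema."""
    return rename(hub_record, HUB_TO_CRM)

crm_row = {"cust_nm": "Ada Lovelace", "cust_tel": "(555) 010-9999", "region": "EMEA"}

hub_row = to_hub(crm_row)            # source -> hub
hub_row["phone"] = "5550101234"      # master data is updated in the hub
synced_row = {**crm_row, **to_crm(hub_row)}  # hub -> source, unmapped fields kept
```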
One of the most common reasons that large corporations experience serious problems with MDM is growth through mergers or acquisitions. Two organizations that merge will typically create an entity with duplicate master data (since each likely had at least one master database of its own prior to the merger). Ideally, database administrators resolve this problem through deduplication of the master data as part of the merger. In practice, however, reconciling several master data systems can present difficulties because of the dependencies that existing applications have on the master databases. As a result, more often than not the two systems do not fully merge but remain separate, with a special reconciliation process defined to ensure consistency between the data stored in the two systems. Over time, as further mergers and acquisitions occur, the problem multiplies: more and more master databases appear, and the data-reconciliation processes become extremely complex and consequently unmanageable and unreliable. Because of this trend, one can find organizations with 10, 15, or even as many as 100 separate, poorly integrated master databases, which can cause serious operational problems in the areas of customer satisfaction, operational efficiency, decision support, and regulatory compliance.
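A reconciliation process between two master databases that remain separate can be sketched as a comparison keyed on a shared identifier. The sketch assumes that both systems already carry a common key (here a hypothetical tax identifier); in practice such a key is often missing and record linkage is needed first.

```python
def reconcile(system_a, system_b, fields):
    """Compare two master databases keyed by a shared identifier.

    Returns keys missing from either side and field-level conflicts.
    """
    report = {"only_in_a": [], "only_in_b": [], "conflicts": []}
    for key in sorted(system_a.keys() | system_b.keys()):
        if key not in system_b:
            report["only_in_a"].append(key)
        elif key not in system_a:
            report["only_in_b"].append(key)
        else:
            a, b = system_a[key], system_b[key]
            diffs = {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}
            if diffs:
                report["conflicts"].append((key, diffs))
    return report

# Hypothetical customer masters of two merged organizations.
org_a = {
    "T100": {"name": "Acme Ltd", "city": "Leeds"},
    "T200": {"name": "Globex", "city": "York"},
}
org_b = {
    "T100": {"name": "Acme Limited", "city": "Leeds"},
    "T300": {"name": "Initech", "city": "Bath"},
}
report = reconcile(org_a, org_b, ["name", "city"])
```

With only two systems a single pairwise comparison suffices; with many master databases the number of pairs to keep consistent grows quickly, which is one way to see why such processes become hard to manage.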
Processes commonly seen in MDM include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, data classification, taxonomy services, item master creation, schema mapping, product codification, data enrichment and data governance.
The selection of entities considered for MDM depends somewhat on the nature of an organization. In the common case of commercial enterprises, MDM may apply to such entities as customer (customer data integration), product (product information management), employee, and vendor. MDM processes identify the sources from which to collect descriptions of these entities. In the course of transformation and normalization, administrators adapt descriptions to conform to standard formats and data domains, making it possible to remove duplicate instances of any entity. Such processes generally result in an organizational MDM repository, from which all requests for a certain entity instance produce the same description, irrespective of the originating sources and the requesting destination.
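The property described above, that every request for an entity instance yields the same description regardless of source or destination, can be sketched with a small in-memory repository. The class, its method names and the cross-reference design are illustrative assumptions, not a standard interface.

```python
class MasterRepository:
    """Toy MDM repository: one description per entity, many source keys."""

    def __init__(self):
        self._golden = {}  # master_id -> agreed description of the entity
        self._xref = {}    # (source system, local id) -> master_id

    def register(self, master_id, description, source_keys):
        """Store the agreed description and the source keys that refer to it."""
        self._golden[master_id] = dict(description)
        for source_key in source_keys:
            self._xref[source_key] = master_id

    def lookup(self, source, local_id):
        """Return the same description whichever source key is used."""
        master_id = self._xref[(source, local_id)]
        return dict(self._golden[master_id])  # copy, so callers cannot edit it

repo = MasterRepository()
repo.register(
    "M-1",
    {"name": "Ada Lovelace", "type": "customer"},
    [("crm", "C-17"), ("billing", "9931")],
)
from_crm = repo.lookup("crm", "C-17")
from_billing = repo.lookup("billing", "9931")
```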
The tools include data networks, file systems, a data warehouse, data marts, an operational data store, data mining, data analysis, data visualization, data federation and data virtualization. One of the newest tools, virtual master data management, uses data virtualization and a persistent metadata server to implement a multi-level automated MDM hierarchy.
Transmission of Master Data
There are several ways in which master data may be collated and distributed to other systems. These include:
- Data consolidation: The process of capturing master data from multiple sources and integrating it into a single hub (operational data store) for replication to other destination systems.
- Data federation: The process of providing a single virtual view of master data from one or more sources to one or more destination systems.
- Data propagation: The process of copying master data from one system to another, typically through point-to-point interfaces in legacy systems.
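The difference between the three approaches above can be sketched with plain dictionaries standing in for systems. The system names and attributes are hypothetical; the sketch only contrasts a stored copy (consolidation), a view computed at read time (federation) and a point-to-point copy (propagation).

```python
def consolidate(sources):
    """Consolidation: copy master data from every source into one hub."""
    hub = {}
    for source in sources:
        for key, attrs in source.items():
            hub.setdefault(key, {}).update(attrs)
    return hub

class FederatedView:
    """Federation: no stored copy; sources are read when a request arrives."""

    def __init__(self, sources):
        self._sources = sources

    def get(self, key):
        merged = {}
        for source in self._sources:
            merged.update(source.get(key, {}))
        return merged

def propagate(source, target, key):
    """Propagation: point-to-point copy of one record between two systems."""
    target[key] = dict(source[key])

# Hypothetical source systems holding attributes of the same customer.
crm = {"42": {"name": "Ada Lovelace"}}
billing = {"42": {"credit_limit": 5000}}
warehouse = {}

hub = consolidate([crm, billing])
view = FederatedView([crm, billing])

crm["42"]["name"] = "Ada King"   # the source changes after consolidation

stale_name = hub["42"]["name"]         # hub keeps its copy until reloaded
live_name = view.get("42")["name"]     # federated view reads the source
propagate(crm, warehouse, "42")
```

The consolidated hub must be refreshed to see the change, whereas the federated view reflects it immediately at the cost of querying every source on each request.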
See also
- Reference data
- Master data
- Record linkage
- Data steward
- Data visualization
- Customer data integration
- Data integration
- Product information management
- Identity resolution
- Enterprise information integration
- Linked data
- Semantic Web
- Data governance
- Operational data store
- Single customer view