Matching, Linking, and Merging
The STEP Matching, Linking, and Merging component offers powerful functionality for identifying and handling duplicate product, entity, asset, and classification objects in STEP.
The matching, linking, and merging functionality is most commonly used for:
- Cleanup operations, such as during data migration
- Matching of the same product from multiple suppliers
- Matching of the same customer from different source systems
- Consolidation of information from different systems
- Cleansing data after migrating records from various sources
Getting Started
Before configuring the matching, linking, and merging functionality, users must define what qualifies two or more objects as duplicates and what the system should do when it encounters such duplicates.
- The match criteria define what qualifies objects as duplicates
- The match action determines what the system should do with such duplicates
Both the match criteria and the match action are included as part of a matching algorithm.
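The relationship between the two parts can be sketched in a few lines of Python. This is only an illustration of the concept, not STEP's API or configuration format: the `MatchingAlgorithm` class, the name-based criterion, and the identify-only action are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class MatchingAlgorithm:
    """Hypothetical pairing of match criteria with a match action."""
    match_criteria: Callable[[Record, Record], bool]  # what counts as a duplicate
    match_action: Callable[[Record, Record], str]     # what to do with a duplicate pair

    def run(self, records: List[Record]) -> List[str]:
        results = []
        # Compare every unordered pair of records exactly once.
        for i, left in enumerate(records):
            for right in records[i + 1:]:
                if self.match_criteria(left, right):
                    results.append(self.match_action(left, right))
        return results


def same_normalized_name(a: Record, b: Record) -> bool:
    # Assumed criterion: names are equal after trimming and lowercasing.
    return a["name"].strip().lower() == b["name"].strip().lower()


def identify_only(a: Record, b: Record) -> str:
    # Assumed action: report the pair without changing any data.
    return f"duplicate: {a['id']} ~ {b['id']}"


records = [
    {"id": "P1", "name": "Acme Drill"},
    {"id": "P2", "name": " acme drill "},
    {"id": "P3", "name": "Acme Saw"},
]
algorithm = MatchingAlgorithm(same_normalized_name, identify_only)
findings = algorithm.run(records)
```

Swapping in a different `match_action` changes what happens to the duplicates without touching the criteria, which is the separation the two bullets above describe.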
Choice of Match Action
When setting up a matching solution, users must choose a match action. The match action defines the workflow and the data model around the objects you are matching.
Users can configure the system to only identify duplicates using the Identify Duplicates match action, or to also take action on those matches. The system supports different action strategies, such as merging records or creating Link Golden Records. For more information, see the Identify Duplicates Match Action topic in this documentation.
Match and Link
Match and Link creates and maintains a set of Golden Records as an aggregation of matching Source Records through an asynchronous process.
- In Product MDM, Match and Link automates the creation and maintenance of Sell-Side Products as Golden Records, based on Buy-Side Products as Source Records.
- In Customer MDM, Match and Link resolves Household Entities as Golden Records from Individual Customer Entities as Source Records.
Match and Link uses an event processor to create and update Link Golden Records that capture the best information from each of the Source Records. The system identifies each new Link Golden Record object with a STEP identifier and links this record to all Source Records contributing to it. Over time, new information may show that some Source Records that were linked together should no longer be linked to the same Golden Record. The algorithm will then link these Source Records to different Link Golden Records. As a result of this automatic linking and splitting, the STEP identifier of the Link Golden Record linked to a given Source Record may change over time.
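As a rough illustration of the linking idea, the following Python sketch groups Source Records, builds one golden record per group, and records the links. It is not how STEP implements Match and Link: the GTIN-based match key, the "first non-empty value wins" aggregation, and the `GR-n` identifiers are assumptions made for the example.

```python
from collections import defaultdict
from itertools import count
from typing import Any, Dict, List, Tuple

Record = Dict[str, str]


def match_key(record: Record) -> str:
    # Assumed criterion: Source Records with the same GTIN describe the same product.
    return record["gtin"]


def link_golden_records(
    sources: List[Record],
) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, str]]:
    # Group Source Records that match each other.
    groups = defaultdict(list)
    for source in sources:
        groups[match_key(source)].append(source)

    ids = count(1)
    golden_records: Dict[str, Dict[str, Any]] = {}
    links: Dict[str, str] = {}  # Source Record id -> golden record id
    for members in groups.values():
        golden_id = f"GR-{next(ids)}"  # hypothetical identifier scheme
        golden: Dict[str, Any] = {"id": golden_id, "sources": [m["id"] for m in members]}
        for member in members:
            for attribute, value in member.items():
                # Assumed aggregation rule: the first non-empty value wins.
                if attribute != "id" and value and attribute not in golden:
                    golden[attribute] = value
            links[member["id"]] = golden_id
        golden_records[golden_id] = golden
    return golden_records, links


sources = [
    {"id": "S1", "gtin": "0001", "name": "Acme Drill", "color": ""},
    {"id": "S2", "gtin": "0001", "name": "", "color": "Red"},
    {"id": "S3", "gtin": "0002", "name": "Acme Saw", "color": "Blue"},
]
golden, links = link_golden_records(sources)
```

Because the golden records are rebuilt from whatever groups the criterion produces, correcting a Source Record (for example, fixing the GTIN on S2) and running the function again can link that record to a different golden record. This mirrors why the identifier linked to a given Source Record may change over time.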
Users should never edit a Link Golden Record object directly. To change the data on a Link Golden Record, users should instead add the information to a special type of Source Record, called a 'Silver Record'. The matching algorithm then merges this information into the Link Golden Record. The promotion of information from the Silver Record to the Link Golden Record happens asynchronously through the Matching Event Processor. For more information, see the Match and Link topic in this documentation.
Match and Merge
The Match and Merge solution uses criteria to match entity records and merge these incoming records into Golden Records.
In Customer MDM and Supplier MDM, Match and Merge is used to consolidate, enrich, and synchronize duplicate records in surrounding systems.
Note: The Match and Merge solution only works for Entities.
Match and Merge works by combining a special importer and an event processor. When the Match and Merge Importer imports a new entity, the importer uses a matching algorithm to compare the incoming entity against the existing Golden Records. If a matching Golden Record already exists, the system promotes the information from the incoming entity to that Golden Record through Survivorship Rules. As Golden Records are updated, a Matching Event Processor identifies matching Golden Records, merges information from one of the records into the other, and deactivates the non-survivor. For more information, see the Match and Merge topic in this documentation.
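The two stages described above can be sketched as follows. This is a simplified model under stated assumptions, not STEP's importer or event processor: matching on email, the "non-empty incoming value wins" survivorship rule, the creation of a new golden record when nothing matches, and all function names are hypothetical.

```python
from typing import Any, Dict, List

Entity = Dict[str, Any]


def is_match(a: Entity, b: Entity) -> bool:
    # Assumed criterion: same email address, ignoring case and padding.
    return a["email"].strip().lower() == b["email"].strip().lower()


def apply_survivorship(survivor: Entity, other: Entity) -> None:
    # Assumed survivorship rule: a non-empty value from `other` replaces the survivor's.
    for attribute, value in other.items():
        if attribute not in ("id", "active") and value:
            survivor[attribute] = value


def import_entity(golden_records: List[Entity], incoming: Entity) -> Entity:
    # Importer stage: promote data to a matching active golden record if one exists.
    for golden in golden_records:
        if golden["active"] and is_match(golden, incoming):
            apply_survivorship(golden, incoming)
            return golden
    # Assumption: with no match, the incoming entity becomes a new golden record.
    golden = {**incoming, "id": f"GR-{len(golden_records) + 1}", "active": True}
    golden_records.append(golden)
    return golden


def merge_duplicates(golden_records: List[Entity]) -> None:
    # Event-processor stage: merge golden records that have come to match each other.
    active = [g for g in golden_records if g["active"]]
    for i, survivor in enumerate(active):
        for other in active[i + 1:]:
            if survivor["active"] and other["active"] and is_match(survivor, other):
                apply_survivorship(survivor, other)
                other["active"] = False  # deactivate the non-survivor


golden_records: List[Entity] = []
import_entity(golden_records, {"email": "ann@example.com", "name": "Ann Lee", "phone": ""})
import_entity(golden_records, {"email": "ann@example.com", "name": "", "phone": "555-0100"})
import_entity(golden_records, {"email": "a.lee@example.com", "name": "Ann Lee", "phone": ""})

# A later update makes the second golden record match the first one.
golden_records[1]["email"] = "ann@example.com"
merge_duplicates(golden_records)
```

After the run, the first golden record holds the combined name and phone number, and the second golden record is kept but marked inactive rather than deleted.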
Match Tuning
Defining match criteria that accurately identify matching records is an iterative process that requires a thorough understanding of the data and collaboration between data owners and the super users defining the match criteria.
During the implementation process, you will tune the match criteria to match the correct records. You may also need to optimize your match criteria to achieve your performance goals.
For more information about the tools available and the recommended process, see the Tuning and Monitoring a Matching Algorithm topic in this documentation.