Match Tuning
Match tuning refines how potential duplicates are identified and handled as data and business requirements evolve.
Because matching directly affects golden records and downstream processes, match tuning is designed as a governed collaboration with clear separation between:
- Super user - Maintains the technical configuration of the matching algorithm
- Data owner - Defines business decisions about match behavior
Match Tuning Purpose and Approaches
Match tuning is the process of refining a matching algorithm to produce the desired outcome across different data scenarios. Because data quality and data patterns vary, defining match criteria is rarely a one-time task. Instead, effective match tuning typically requires both a solid understanding of the data and collaboration between data owners (who understand the business meaning of the data) and super users (who maintain and operate the matching configuration).
At the core of match tuning is the evaluation and refinement of match criteria: the rules that determine when two objects should be treated as duplicates and what action to take. In STEP, you can tune match criteria using two main approaches, depending on the source of the tuning data.
Tuning on Imported Golden Record Data (Matching Rulesets)
This approach uses Matching Rulesets and is performed post-import, based on golden records already present in the system. It combines record pair analysis (including rules and pair export with statistics) with decision-making in the dedicated Match Tuning UI, as described in the topic Match Tuning UI.
When a ruleset is activated, the matching algorithm uses the ruleset decisions as the primary match criteria. Any rules defined directly on the matching algorithm (the rules table) are then used only as fallback when a record pair falls into a combination with no explicit decision.
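The precedence described above can be sketched in a few lines of Python. This is only a conceptual illustration, not STEP configuration or a STEP API: the function `decide`, the decision labels, and the representation of a "combination" as a tuple of per-matcher outcomes are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Tuple

# Hypothetical representation: a "combination" is a tuple of per-matcher
# outcomes for one record pair, e.g. ("name:high", "address:low").
Combination = Tuple[str, ...]


def decide(
    combination: Combination,
    ruleset_decisions: Dict[Combination, str],
    rules_table_fallback: Callable[[Combination], str],
    ruleset_active: bool,
) -> str:
    """Return the decision for a record pair.

    An active ruleset is consulted first; the rules table is used only
    when the ruleset holds no explicit decision for the combination
    (or when no ruleset is active at all).
    """
    if ruleset_active:
        decision = ruleset_decisions.get(combination)
        if decision is not None:
            return decision
    return rules_table_fallback(combination)


# Illustrative data only; labels such as "merge" and "review" are assumptions.
decisions = {("name:high", "address:high"): "merge"}


def fallback(combination: Combination) -> str:
    # Stand-in for the rules defined directly on the matching algorithm.
    return "review"
```

With this sketch, a pair in a combination that has an explicit decision gets that decision, while any other pair is handed to the fallback rules.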
For details, refer to the topic Match Tuning using Matching Rulesets.
Important: Matching Rulesets are supported only for matching algorithms using embedded match codes and are supported exclusively for match action Merge Golden Record.
Note: Access to Matching Rulesets and the Match Tuning UI is provided through an early adopter program enabled by Stibo Systems. To participate, send a request to simplifymatching@stibosystems.com.
Tuning on Sample Data (Match Tuning Configuration)
This approach uses a Match Tuning Configuration and is performed pre-import, based on a sample dataset before it enters STEP. A Match Tuning Configuration allows data stewards to analyze sample data and iteratively adjust the matching algorithm before running an import.
When using this approach, the match criteria rules must be defined in the rules table on the matching algorithm. Match criteria are not defined through ruleset decisions in this approach.
A Match Tuning Configuration can evaluate the matching algorithm against sample import data and generate reports such as Pair Export and Match Codes Export. Match tuning is an iterative process of adjusting match codes, match criteria, and matcher thresholds and then re-evaluating until the algorithm produces acceptable match results and performance for the incoming dataset.
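The adjust-and-re-evaluate loop can be illustrated with a small Python sketch. It is not how STEP implements match tuning and does not read a real Pair Export; it assumes a hypothetical sample of record pairs that each carry a similarity score and a reviewer's verdict, and it tries candidate thresholds until one meets the chosen quality targets. All names (`ScoredPair`, `evaluate`, `tune_threshold`) are invented for the illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ScoredPair:
    score: float        # hypothetical similarity score for the pair
    is_duplicate: bool  # reviewer's verdict on the sample pair


def evaluate(pairs: List[ScoredPair], threshold: float) -> Tuple[float, float]:
    """Return (precision, recall) when pairs scoring >= threshold are matched."""
    tp = sum(1 for p in pairs if p.score >= threshold and p.is_duplicate)
    fp = sum(1 for p in pairs if p.score >= threshold and not p.is_duplicate)
    fn = sum(1 for p in pairs if p.score < threshold and p.is_duplicate)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def tune_threshold(
    pairs: List[ScoredPair],
    candidates: Iterable[float],
    min_precision: float,
    min_recall: float,
) -> Optional[float]:
    """Try thresholds in ascending order; return the first that meets both targets."""
    for threshold in sorted(candidates):
        precision, recall = evaluate(pairs, threshold)
        if precision >= min_precision and recall >= min_recall:
            return threshold
    return None  # no candidate is acceptable; revisit match codes or criteria


# Illustrative sample only.
sample = [
    ScoredPair(95, True),
    ScoredPair(88, True),
    ScoredPair(72, False),
    ScoredPair(65, True),
    ScoredPair(40, False),
]
```

A `None` result corresponds to the case where adjusting thresholds alone is not enough and the match codes or match criteria need another iteration.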
This approach requires a one-time setup to create the match tuning setup structure and enable the creation of Match Tuning Configuration objects. For details, refer to the topic Match Tuning using Match Tuning Configuration.
Use the following steps to configure and use your tuning solution:
- Perform the setup for match tuning, as defined in the topic Initial Setup for Match Tuning.
- Create a match tuning configuration, as defined in the topic Configuring Match Tuning.
Note: Match tuning using Match Tuning Configuration only works in a Match and Merge solution, which is for entities.
Additional Tools That Support Match Tuning
Effective match tuning begins with understanding your data. Before tuning match behavior, it is strongly recommended to use:
- Data Profiling to analyze completeness and value patterns and to identify data quality issues that may be better addressed upstream. For more information, refer to the topic Data Profiling.
- The Match Result tab in conjunction with the Duplicates tabs, which use standard data profiling tools to identify data entries that are appropriate for matching records. For more information, refer to the topics Matching Algorithm - Match Result Tab and Matching Algorithm - Duplicates Tabs.
Ruleset decisions and the rules table coexist: when a ruleset is activated, ruleset decisions are primary, and the matching algorithm’s rules table acts as the fallback when no ruleset decision exists.