Matching models are the algorithms DataGroomr uses to detect duplicate records. DataGroomr provides two options for duplicate detection: a machine learning matching model and a classic matching model.
Accessing Matching Models
To access matching models, select Settings under Dedupe on the Navigation Menu.

From the Settings page, click Matching and then select your desired object.

Creating a Matching Model
There are two types of matching models in DataGroomr: Machine Learning models and Classic models.
Press the ADD MODEL button and select the desired model type:

Classic Matching Model
- Rule-based approach: You manually define which fields to compare (e.g., First Name, Email, Company) and assign weights to each. 
- Deterministic logic: Matches are found based on exact or fuzzy logic using predefined thresholds. 
- Transparent scoring: You can clearly see which fields influenced a match and by how much. 
- Best for: Simple or predictable data scenarios where you want full control over match logic. 
Machine Learning Matching Model
- AI-assisted approach: Uses a machine learning algorithm trained on labeled examples of duplicates and non-duplicates. 
- Pattern recognition: Learns complex relationships between fields that may not be obvious or linear. 
- Adaptive: Gets better with more training data (you label examples, the model adjusts). 
- Confidence scores: Matches are scored based on statistical confidence, not rigid rules. 
- Best for: Large or messy datasets where manual rules may miss patterns or create false positives. 
Selecting either model option will open a dialog that includes the following elements:

- Name - enter a unique name for your model
- Fields - select all fields that should be included as part of this model.
- Field Sets - lists the matching rules configured in Salesforce and the sets of fields that DataGroomr recommends for that object based on commonly matched fields.
- Assist - DataGroomr AI selects the best fields for deduplication based on best practices and the statistics of your data.
Press the Save button to create the model.
Basic Fields Configuration
The prefilled matching type is auto selected by DataGroomr based on the type of field.
The following comparison types are available for Classic Models:
- Exact - matches if values are exactly the same
- Similar - matches if values are similar
The following comparison types are available for ML models:
- Text - compares text values. This is the default comparison type.
- Short Text - compares short text values. It is faster than Text and works well for fields such as city names and zip codes.
- Long Text - compares long text values such as Description. Preselected for TextArea field types.
- Name - compares person or company names.
- List - compares values in a list. Preselected for picklists.
- Date/Time - compares values as dates and times. Preselected for Date and DateTime field types.
- Number - compares numbers. Preselected for price and number field types.
- Exact - checks whether values are exactly the same.
Note: All data is cleaned up before comparison, for all field types:
- text fields are converted to lowercase and all special characters are removed
- phone numbers are reduced to digits only
- emails are converted to lowercase
- websites are normalized to exclude the protocol and a leading www
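The cleanup rules above can be illustrated with a minimal Python sketch. This is not DataGroomr's implementation; the function names and the exact definition of "special characters" (here, anything other than letters, digits, and whitespace) are assumptions made for illustration.

```python
import re


def clean_text(value):
    """Lowercase the text and remove special characters (assumed: punctuation and symbols)."""
    lowered = value.lower()
    stripped = re.sub(r"[^\w\s]|_", "", lowered)
    # Collapse any runs of whitespace left behind by removed characters.
    return re.sub(r"\s+", " ", stripped).strip()


def clean_phone(value):
    """Keep only the digits of a phone number."""
    return re.sub(r"\D", "", value)


def clean_email(value):
    """Convert an email address to lowercase."""
    return value.lower()


def clean_website(value):
    """Drop the protocol (for example https://) and a leading www."""
    without_protocol = re.sub(r"^[a-z][a-z0-9+.-]*://", "", value.strip(), flags=re.IGNORECASE)
    return re.sub(r"^www\.", "", without_protocol, flags=re.IGNORECASE)
```

With these helpers, "Acme, Inc." and "acme inc" clean to the same text, and "https://www.example.com" and "example.com" clean to the same website value.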
In addition to the settings above, Classic models let you specify Field Weights. Click a percentage value to set how much a match on that field contributes to the match confidence score. The higher the percentage set with the slider, the greater the influence a match on that field has on the match confidence score.
Note: In a Classic model, records are grouped when the sum of the weights for all matched fields meets or exceeds the dataset’s minimum confidence threshold. By raising or lowering individual field weights (and/or the threshold), you can make matching behave more like a broad OR (looser) or a strict AND (tighter) condition. For example, the example model above will generate groups in which records
- matching on all fields will have 100% match confidence;
- matching on Full Name and Business Phone will have 70% match confidence;
- matching on Full Name, Business Phone and Email will have 90% match confidence;
- matching on Full Name and Email will have 60% match confidence;
- and so forth, as long as the group match confidence meets or exceeds the minimum confidence selected in the dataset.
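The weighted-sum rule can be sketched in a few lines of Python. The individual weights are not stated in the text; the values below are inferred from the example confidences listed above (Full Name 40, Business Phone 30, Email 20), and "Other Field" is a placeholder for whatever field carries the remaining 10%. The function names are hypothetical.

```python
# Assumed weights (percent), inferred from the example confidences in the text.
# "Other Field" is a placeholder, not a real field name from the example model.
WEIGHTS = {"Full Name": 40, "Business Phone": 30, "Email": 20, "Other Field": 10}


def match_confidence(matched_fields, weights=WEIGHTS):
    """Sum the weights of the fields on which two records matched."""
    return sum(weights[field] for field in matched_fields)


def is_grouped(matched_fields, min_confidence, weights=WEIGHTS):
    """Records are grouped when the summed weight meets or exceeds the threshold."""
    return match_confidence(matched_fields, weights) >= min_confidence
```

With a minimum confidence of 60, a match on Full Name and Email (40 + 20) is grouped, while a match on Email alone (20) is not. Raising the threshold toward 100 pushes the model toward a strict AND across fields; lowering it moves toward a broad OR.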
Advanced Fields Configuration (Gear icon)

1. Blank Values
You can customize how blank values are handled during matching. You can specify whether records should match when both values are blank, when either value is blank, or whether blank values should be disregarded entirely. This setting affects match confidence values.
2. Group Name
Fields can be assigned to a Group, which tells DataGroomr to treat them as interchangeable when looking for duplicates. Instead of comparing each field in isolation, the engine will compare all fields within the same group against one another. This is especially useful when the same type of data might be stored in different places — for example, the same phone number could appear in multiple phone fields (work phone, home phone, business phone). By giving those fields the same Group Name, DataGroomr can cross-compare them and detect duplicates even if the data is entered in different fields.
Good to know: Most commonly used groups for Phone and Email fields are pre-created.
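As a rough illustration of cross-comparison within a group, the sketch below treats two records as matching on a group when any value in the group's fields on one record equals any value in those fields on the other. The field names, the helper names, and the use of exact equality on already-cleaned values are assumptions; DataGroomr's actual comparison may differ.

```python
def group_values(record, group_fields):
    """Collect the non-blank values stored in any field of the group."""
    return {record[field] for field in group_fields if record.get(field)}


def group_matches(record_a, record_b, group_fields):
    """True when the two records share at least one value across the group's fields."""
    return bool(group_values(record_a, group_fields) & group_values(record_b, group_fields))


# Hypothetical phone group; values are assumed to be cleaned to digits already.
PHONE_GROUP = ["Phone", "MobilePhone", "HomePhone"]
record_a = {"Phone": "5551234567", "MobilePhone": ""}
record_b = {"Phone": "", "HomePhone": "5551234567"}
```

Here `group_matches(record_a, record_b, PHONE_GROUP)` is true even though the shared number sits in Phone on one record and HomePhone on the other.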
3. Required Match
When enabled, this setting enforces a rule in the matching model that requires records to match on a specific field in order to be considered duplicates. In other words, even if other fields indicate similarity, records will not be flagged as duplicates unless they also match on the required field.
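One way to picture a required match is as a veto applied before scoring. The sketch below assumes a Classic-style weighted score and hypothetical field names and weights; it is an illustration of the rule described above, not DataGroomr's code.

```python
def is_duplicate(matched_fields, weights, required_fields, min_confidence):
    """Flag a pair as duplicates only if every required field matched
    and the summed weight of matched fields reaches the threshold."""
    # A required field that did not match vetoes the pair regardless of score.
    if not set(required_fields) <= set(matched_fields):
        return False
    return sum(weights[field] for field in matched_fields) >= min_confidence


# Hypothetical weights (percent) used only for this example.
EXAMPLE_WEIGHTS = {"Full Name": 40, "Business Phone": 30, "Email": 20}
```

With Email required and a threshold of 60, a pair matching on Full Name and Business Phone scores 70 but is still rejected, while a pair matching on Full Name and Email scores 60 and is accepted.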
4. Transform
Applies a custom transformation to a field before comparison. Use it to clean, format, or extract data from a field, for example extracting domain names from emails, suffixes, and so forth.
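The email-domain case can be sketched as follows. The function name is hypothetical and this does not reflect how transformations are defined inside DataGroomr; it only shows the effect of comparing on the extracted domain rather than the full address.

```python
def email_domain(email):
    """Return the part after the last "@", lowercased; empty string if there is no "@"."""
    _, separator, domain = email.rpartition("@")
    return domain.lower() if separator else ""
```

After this transformation, "jane.doe@Acme.com" and "bob@acme.com" both compare as "acme.com", so two contacts at the same company can match on the Email field even though their addresses differ.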
5. Synonyms
When selected, words listed together in a dictionary are treated as the same word. A common example is the contact name Robert, which might alternatively be entered as Rob, Bob, or Robbie.
6. Ignore Words
When selected, words contained in a dictionary list are ignored during comparison, so they do not affect the similarity between two field values. Examples include Corporation, Corp, Incorporated, and Inc.
You can add more words to your synonym and ignore-word lists in Supervisr: Dictionaries.
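The effect of the Synonyms and Ignore Words settings can be sketched as a normalization step applied before comparison. The dictionaries below contain only the examples mentioned in the text, and the function name and the choice to map each synonym to one canonical form are assumptions for illustration.

```python
# Example dictionaries built from the words mentioned in the text.
SYNONYMS = {"rob": "robert", "bob": "robert", "robbie": "robert"}
IGNORE_WORDS = {"corporation", "corp", "incorporated", "inc"}


def apply_dictionaries(value, synonyms=SYNONYMS, ignore_words=IGNORE_WORDS):
    """Drop ignored words and replace synonyms with a canonical form."""
    words = value.lower().replace(".", "").replace(",", "").split()
    kept = [synonyms.get(word, word) for word in words if word not in ignore_words]
    return " ".join(kept)
```

After this step, "Bob Smith" and "Robert Smith" compare as the same value, as do "Acme Corp." and "Acme, Inc".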
7. First N Characters
The First N Characters setting compares only a specified number of leading characters of a field value instead of the entire text. For example, it can be used to compare only the area code in a phone number field or the leading digits of a zip code.
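A minimal sketch of this comparison, assuming the values have already been cleaned (for example, phone numbers reduced to digits); the function name is hypothetical:

```python
def first_n_match(value_a, value_b, n):
    """Compare only the first n characters of two values."""
    return value_a[:n] == value_b[:n]
```

With n set to 3, the phone numbers "5551234567" and "5559876543" match because they share the area code 555, even though the full numbers differ.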
Machine Learning Matching Model
Machine Learning models are based on algorithms powered by machine learning. Before an ML model can be used in a dataset, it must be trained. Click the Train button to start training.
There are two options for training machine learning models:
- Train by AI Assistant (recommended): the AI trains and fine-tunes the machine learning model autonomously. Once the AI has profiled and trained the ML model, you can apply it directly to the specified dataset.
- Train Manually: you tell DataGroomr which records are duplicates and which are not, teaching the algorithm to identify patterns in the data based on the fields you specified during setup.
Tip: A model may be trained multiple times to improve accuracy.
Learn more: Training machine learning model
Editing an Existing Model
To edit an existing model, select it and then press the Open button.

A Classic matching model can be edited at any time. A Machine Learning model can be edited only while it is in the Draft state. Once it is Trained, it can be cloned, and the cloned version can then be retrained.
Learn more: Training machine learning model
Cloning a Model
Occasionally you may need to modify or retrain an existing matching model, for example to add or remove a field. An existing model may not be editable in this way (for example, a trained Machine Learning model), but you can create a copy that can be edited.
To do this, select the model and then press the CLONE button.

Change Model Type When Cloning
You can now choose the model type during the cloning process.
When cloning, select whether the new model should be a Machine Learning (ML) model or a Classic model.
- All existing field mappings, weights, and confidence thresholds are preserved. 
- The new model will adopt the logic of the selected type. 
This feature allows you to test the same dataset across both model types without recreating your configuration from scratch.

Deleting a Model
To delete a model, select it and then press the Trash button.

Assigning to Datasets
Classic matching models and trained Machine Learning models can be assigned to datasets or designated as the default model for your organization.
In Matching Models, select a model and press the Assign button. Then choose the datasets to apply it to and press the Assign button.
Good to Know: Alternatively, the same can be done from the Dataset Configuration window.
To set a model as the default, select the model and then press the SET DEFAULT button. A green 'Default' label will be displayed next to that model.

 
                 

