Matching Models are based on machine learning algorithms. When a new model is created, it needs to be Trained. During Training, DataGroomr looks for duplicates based on the fields you specified, and a person confirms (or rejects) a small subset of the discovered duplicates, which the machine learning model needs in order to work. This page describes the Training process.
Tip: A rule may be trained multiple times to improve accuracy.
When you press the TRAIN button, DataGroomr analyzes your existing data to identify duplicate (and non-duplicate) sets of records. The time required depends on the number of records in your Salesforce environment. You may exit this window and return at any time.
You will be shown sets of potentially duplicate records along with three options:
- YES - the records are duplicates
- NO - the records are not duplicates
- NOT SURE - you cannot determine whether the records are duplicates
We recommend identifying 5 positive sets (duplicates) and 5 negative sets (non-duplicates) for each field included in the model. For example, if your model consists of 4 fields, you should review at least 40 sets. Once a sufficient number of sets has been reviewed, the FINISH button becomes active. Pressing it opens a confirmation window with additional information.
Press the CONFIRM button to activate the model.
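If you want to estimate how many sets to review before you start Training, the recommendation above (5 positive and 5 negative sets per field) can be worked out with a quick calculation. The snippet below is only an illustrative sketch: the function name `recommended_review_sets` and its parameters are made up for this example and are not part of DataGroomr.

```python
def recommended_review_sets(field_count, positives_per_field=5, negatives_per_field=5):
    """Estimate the minimum number of sets to review during Training.

    Hypothetical helper for illustration only; it simply multiplies the
    number of fields in the model by the recommended positive and negative
    sets per field.
    """
    if field_count < 1:
        raise ValueError("a model needs at least one field")
    return field_count * (positives_per_field + negatives_per_field)


# A model with 4 fields: 4 * (5 + 5) = 40 sets
print(recommended_review_sets(4))
```

The per-field defaults follow the recommendation in this article. The exact point at which the FINISH button becomes active is decided by DataGroomr, so treat the result as a planning estimate rather than a guarantee.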