The biggest data quality challenge faced by CRM systems is duplicates. Your CRM system will always have a few duplicate records unless you have an effective data deduplication strategy in place. Here are a few reasons for this:
- Users may skip the record search and simply keep adding new contacts.
- Data import tools may fail to point out duplicate records.
- Integrations with external sources, such as online registration sites, may not check existing CRM data before adding to it.
- Various types of software bugs within or outside the CRM can create thousands of duplicate records.
A duplicate rate of 2 percent isn’t bad if your tools can easily spot and correct the duplicates. Go above 5 percent, and users will start to worry. Systems with a duplicate rate above 25 percent can be a threat to careers.
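To see where your own system stands against the thresholds above, here is a minimal sketch that measures a duplicate rate. It assumes records are plain dictionaries and that a normalized email is a good enough matching key; the function names and the "manageable / worrying / critical" labels are illustrative, not part of any CRM product.

```python
from collections import Counter

def duplicate_rate(records, key):
    """Return the share of records that are surplus copies under `key`."""
    if not records:
        return 0.0
    counts = Counter(key(rec) for rec in records)
    # Every record beyond the first in a group counts as a duplicate.
    surplus = sum(n - 1 for n in counts.values())
    return surplus / len(records)

def severity(rate):
    """Map a duplicate rate to the rough bands described in the text."""
    if rate > 0.25:
        return "critical"    # above 25 percent: a threat to careers
    if rate > 0.05:
        return "worrying"    # above 5 percent: users start to worry
    return "manageable"      # assumption: anything up to 5 percent

def email_key(rec):
    # Hypothetical matching key: trimmed, lower-cased email address.
    return rec["email"].strip().lower()
```

In practice the matching key matters more than the arithmetic: a stricter key hides duplicates, a looser one inflates the rate.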
Detecting and Correcting Data Duplicates Takes Time
Like any data pollution problem, the correction cycle takes time. You need to put the right correction plan in place, with the right expectations, schedule, and budget.
Begin with the best duplicate detection tool available to you, and test it to make sure it doesn’t produce errors serious enough to block your integrations with external programs.
The next thing is to analyze the systemic sources of duplicates, narrow them down and get your team working to block the inflow of duplicate records. Repeat as necessary.
While this is under way, analyze the heuristics of the duplication. Don’t forget that each table will have a different pattern. Since deduplication requires an iterative approach, ensure that everyone on your team is familiar with the tools and processes. The best candidates to start with are the “lowest-risk” records: those with the fewest other records pointing to them.
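One way to find those lowest-risk candidates is to count how many other records point at each one. The sketch below assumes you can export pointers as (source, target) ID pairs; the function name and data shape are hypothetical.

```python
from collections import Counter

def rank_by_risk(record_ids, references):
    """Order record IDs from lowest to highest risk.

    `references` is an iterable of (source_id, target_id) pointers.
    Risk here is simply the number of inbound pointers to a record.
    """
    inbound = Counter(target for _, target in references)
    # Ties are broken by ID so the ordering is stable and repeatable.
    return sorted(record_ids, key=lambda rid: (inbound[rid], rid))
```

Records at the front of the list are the safest ones to practice merges on.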
When analyzing the heuristics of duplicates, there are four things you need to understand:
- The most effective way of detecting a potential duplicate “pair”;
- The best way of picking the winner in the merge cycle;
- Which sections of the “loser” record need to be kept; and
- The best way to deal with pointers to and from the loser record.
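The four questions above can be sketched in a few small functions. This is an illustration under stated assumptions, not how any particular dedupe tool works: pairs are detected by a normalized email, the most recently modified record wins, loser values only fill the winner's empty fields, and only pointers to the loser are re-pointed. All field names (`id`, `email`, `modified`) are hypothetical.

```python
from itertools import combinations

def match_key(rec):
    # Assumed pairing rule: identical email after trimming and lower-casing.
    return rec["email"].strip().lower()

def find_pairs(records):
    """Return candidate duplicate pairs that share a match key."""
    groups = {}
    for rec in records:
        groups.setdefault(match_key(rec), []).append(rec)
    pairs = []
    for group in groups.values():
        pairs.extend(combinations(group, 2))
    return pairs

def pick_winner(a, b):
    """Return (winner, loser); assumed rule: latest 'modified' date wins."""
    return (a, b) if a["modified"] >= b["modified"] else (b, a)

def merge(winner, loser, keep_fields):
    """Copy loser values into the winner only where the winner is empty."""
    merged = dict(winner)
    for field in keep_fields:
        if not merged.get(field) and loser.get(field):
            merged[field] = loser[field]
    return merged

def repoint(references, loser_id, winner_id):
    """Redirect (source, target) pointers from the loser to the winner."""
    return [(src, winner_id if tgt == loser_id else tgt)
            for src, tgt in references]
```

Each table will likely need its own version of `match_key` and `pick_winner`, since the duplicate pattern differs from table to table.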
These days, most data deduplication tools work inside your live database, which means they don’t interrupt other processes. Even so, deduping is not without its challenges. It can be a significant pain because there’s no “undo” function, so extra precaution is a must.
CRM Data Deduplication Needs a Methodical Approach
This is what the data deduplication cycle looks like:
- Carry out a full system backup (yes, every time).
- If you can, do all of your work in the system sandbox first, then repeat the whole process in the production instance.
- Keep a log of each process so you can repeat it easily if anything goes wrong.
- Normalize all tables, especially picklist values, country codes, and other items where clean field data will improve the quality of duplicate detection.
- Identify the fields where you want both the winner’s and the loser’s values preserved, such as email, phone numbers, owner, and record type. Make sure you complete every step in this stage before you proceed.
- Set up your data deduplication tool, using the fields and matching criteria with the greatest potential to spot duplicate records. Run the tool to see how complete the list of duplicate candidates is.
- Define the merge rules and processing scenarios that pick the winner in every duplicate pair. Ensure that text fields are merged in “append mode” so as to preserve the loser’s data. Allow users to review the merge results for possible gotchas.
- With these criteria set, carry out a test merge and check for unintended consequences. Once you are satisfied, run through the process in production. Check again after a day for any side effects.
- Back up the whole system again if everything goes according to plan in production.
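Two of the steps above, normalizing field values and merging text in “append mode,” are easy to illustrate. The sketch below is only an example: the alias table, the separator, and the function names are assumptions, and a real cleanup would use your own picklist values and your tool's merge settings.

```python
# Hypothetical alias table; extend it with the values found in your data.
COUNTRY_ALIASES = {
    "usa": "US", "u.s.": "US", "us": "US", "united states": "US",
    "uk": "GB", "gb": "GB", "united kingdom": "GB",
}

def normalize_country(value):
    """Map known spellings to one code; leave unknown values trimmed."""
    cleaned = value.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)

def append_merge(winner_text, loser_text, separator="\n---\n"):
    """Merge a text field in append mode so the loser's data survives."""
    winner_text = (winner_text or "").strip()
    loser_text = (loser_text or "").strip()
    if not loser_text or loser_text == winner_text:
        return winner_text          # nothing new to preserve
    if not winner_text:
        return loser_text           # winner was empty, keep the loser's text
    return winner_text + separator + loser_text
```

Running the normalization before matching means that “USA” and “United States” no longer prevent two otherwise identical records from being paired.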
High-Risk Data Will Cause the Most Dedupe Headaches
High-risk records typically have many things pointing to them. If such a record is lost, those references are lost too. In many CRM systems, the account table is the highest-risk table, and outside systems usually aren’t pointing to it. You may determine that the accounts can’t be merged at all. Then what?
Many CRMs have the concept of a parent account, which offers a solution strategy when accounts can’t be merged. A new account serves as the parent of the duplicate accounts, acting as a “holding company” of sorts for roll-ups.
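As a rough illustration of the parent-account strategy, the sketch below links unmergeable duplicates to a new parent and rolls a numeric field up through that link. The `parent_id` and `revenue` fields and both function names are hypothetical; your CRM will have its own parent lookup and roll-up mechanism.

```python
def group_under_parent(accounts, duplicate_ids, parent_id, parent_name):
    """Return a new account list with a parent added above the duplicates."""
    parent = {"id": parent_id, "name": parent_name, "parent_id": None}
    updated = [parent]
    for acct in accounts:
        acct = dict(acct)  # copy, so the original records stay untouched
        if acct["id"] in duplicate_ids:
            acct["parent_id"] = parent_id
        updated.append(acct)
    return updated

def roll_up(accounts, parent_id, field):
    """Sum a numeric field across the direct children of a parent."""
    return sum(a.get(field, 0) for a in accounts
               if a.get("parent_id") == parent_id)
```

Nothing is merged or deleted here, so every existing pointer to the duplicate accounts keeps working.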
Duplicate records are no doubt the most common form of data corruption. Unfortunately, data deduplication is a process, not an event. Even if you manage to clean up the system completely right now, a system or software update will eventually create a new pattern of duplicates. Because cleanup is easier when there’s only one error pattern to deal with, you will need to run a complete duplicate detection cycle every month.
If you need to de-dupe your records in Salesforce or another CRM, feel free to contact us at 1-877-332-4923. StrategicDB specializes in de-duping accounts, contacts, and leads.