Every system contains some bogus data, entered either by internal teams testing capabilities or by prospects who did not want to share their personal data. There are a few ways to spot bogus data systematically, including:
- Format does not match: For fields such as website or email, check that the value contains the expected characters (a . for a website, both . and @ for an email) and meets a minimum length.
- Characters found in numeric fields: Spotting ‘abcsd-sfd’ in a phone field may be a red flag. However, letters alone do not prove a record is bogus, since brands do use valid vanity numbers such as ‘1-800-clean-data’.
- Same character repeated more than 3 times: If a company name or first name is ‘aaaa’ or ‘bbbbbb’, you know it is bogus.
- Search for “Test”: Most people who test their CRM enter “test” to mark the record as bogus. You may find tens of such records.
- Too many characters: Spam blockers stop most spam form submissions, but not all. For example, if your first name field contains ‘if you are looking for data cleansing firms’, chances are it's bogus, as even the most complex names stop at about 50 characters.
- Mismatched Address/Phone #: If the state is New York, the zip code is in California, and the phone # has a Chicago area code, you know there is an issue. Keep in mind that nowadays people do work remotely, so be careful when matching a phone number to an address.
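Several of the checks above can be sketched in a few lines of Python. This is a rough illustration, not a definitive implementation: the field names (`email`, `phone`, `first_name`, `company`), the regular expressions, and the 50-character limit are all assumptions you would adapt to your own CRM. It only flags records for review and does not cover the address/phone mismatch check, which needs reference data.

```python
import re

# All patterns and thresholds below are illustrative assumptions.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # something@something.something
REPEAT_RE = re.compile(r"(.)\1{3,}")                  # same character 4+ times in a row
TEST_RE = re.compile(r"\btest\b", re.IGNORECASE)      # the standalone word "test"
MAX_NAME_LEN = 50


def bogus_flags(record: dict) -> list[str]:
    """Return the names of the heuristics that a record trips."""
    flags = []

    email = record.get("email", "")
    if email and not EMAIL_RE.match(email):
        flags.append("email_format")

    phone = record.get("phone", "")
    if re.search(r"[A-Za-z]", phone):
        # Review manually: vanity numbers such as 1-800-clean-data are valid.
        flags.append("letters_in_phone")

    for field in ("first_name", "company"):
        value = record.get(field, "")
        if REPEAT_RE.search(value):
            flags.append(f"{field}_repeated_chars")
        if TEST_RE.search(value):
            flags.append(f"{field}_test")
        if len(value) > MAX_NAME_LEN:
            flags.append(f"{field}_too_long")

    return flags
```

A record that trips one or more flags is a candidate for review rather than automatic deletion, since each heuristic can produce false positives.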
These are just some of the ways to identify bogus or dirty records. If you are certain a record is bogus, the best solution is to delete it. If a record is partly correct, for example it has a deliverable email, delete only the bogus fields and keep the parts that are real.
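The delete-or-keep decision could look something like the sketch below. The function name `clean_record` and its parameters are hypothetical, and it assumes you have already worked out which fields are bogus and whether the email is deliverable (for example, through an email verification step). Treating a deliverable email as the only reason to keep a record is a simplifying assumption.

```python
from typing import Optional


def clean_record(record: dict, bogus_fields: set, email_deliverable: bool) -> Optional[dict]:
    """Return None to delete the record, or a copy with the bogus fields blanked."""
    if not bogus_fields:
        return dict(record)  # nothing to fix
    if not email_deliverable:
        return None          # nothing trustworthy left: delete the record
    # Keep the real parts and blank out only the fields judged bogus.
    return {key: (None if key in bogus_fields else value)
            for key, value in record.items()}
```

For example, a record with a deliverable email and a first name of ‘aaaa’ would come back with the email intact and the first name set to `None`.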
If you are looking for data cleaning services, including identification of bogus records, look no further! StrategicDB is a one-stop shop for all your data cleaning needs. Find out why today!