What is data cleaning? Data cleaning is the process of making sure that data is usable, unique and correct. It covers a range of tasks, from de-duplication to data appending. Here are some of the most common:

Data De-duplication – This refers to either merging or deleting records that are duplicated. This can be achieved using data deduplication tools.
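
As a rough illustration of the merge approach, here is a minimal Python sketch. It assumes records are plain dictionaries and that a normalized email address is a good enough match key; the field names and the `deduplicate` helper are hypothetical, and real deduplication tools typically use fuzzier matching.

```python
def normalize_email(email):
    """Lower-case and trim an email so trivial variants compare equal."""
    return email.strip().lower()


def deduplicate(records):
    """Merge records that share a normalized email (assumed match key).

    The first record seen for each email is kept, and any empty fields
    on it are filled in from later duplicates.
    """
    merged = {}
    for record in records:
        key = normalize_email(record["email"])
        if key not in merged:
            # First time we see this email: keep a copy with the clean key.
            merged[key] = dict(record, email=key)
            continue
        existing = merged[key]
        for field, value in record.items():
            # Only fill gaps; never overwrite a value we already have.
            if not existing.get(field) and value:
                existing[field] = value
    return list(merged.values())


# Hypothetical sample data: the first two rows are the same person.
records = [
    {"email": "Ann@Example.com", "name": "Ann Lee", "phone": ""},
    {"email": "ann@example.com ", "name": "", "phone": "555-0100"},
    {"email": "bob@example.com", "name": "Bob Ray", "phone": "555-0101"},
]
unique = deduplicate(records)
```

Merging keeps information that deleting a duplicate outright would lose, such as the phone number that only appears on the second row.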

Data Validation or Data Verification – Validating data is often associated with data cleaning when dealing with customer or prospect databases. Data validation is the task of checking whether a record is correct. Typically validation is done on email, address or phone number for consumer or business records. There is a difference between validating and verifying: validating refers to simply checking if the address exists, whereas verifying refers to checking if the person is associated with the address provided. Some companies use the terms interchangeably, so double check prior to starting your data cleaning project.
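
Before paying for an external check, a simple format screen can weed out obviously malformed values. The Python sketch below is only that first pass: it cannot confirm that an address exists or who it belongs to, which usually requires an outside service. The pattern is deliberately loose, the phone rule assumes US-style numbers, and both function names are hypothetical.

```python
import re

# Loose pattern: something@something.something, no spaces or extra "@".
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value):
    """Return True if the value has the basic shape of an email address."""
    return bool(EMAIL_PATTERN.match(value.strip()))


def looks_like_us_phone(value):
    """Return True for 10 digits, or 11 digits with a leading 1 (assumed US format)."""
    digits = re.sub(r"\D", "", value)  # drop spaces, dashes, brackets, "+"
    return len(digits) == 10 or (len(digits) == 11 and digits.startswith("1"))


# Hypothetical sample values.
email_ok = looks_like_email("ann@example.com")
email_bad = looks_like_email("ann@example")
phone_ok = looks_like_us_phone("(555) 010-0199")
phone_bad = looks_like_us_phone("555-0100")
```

Records that fail a screen like this can be flagged for correction, and only the remainder sent on for full validation or verification.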

Data Appending – Appending, or enhancing, data usually refers to filling in missing information. For example, if a record has incomplete information, you may acquire the missing data from third party providers or flag it for future data appending. Typically you would append information such as address, email, industry, title, and so on. Adding an entirely new piece of data to your records, rather than filling in a missing value, is called data enhancement.
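
The fill-or-flag logic can be sketched in a few lines of Python. In this sketch the `reference` dictionary stands in for whatever a third party provider would return for the record; it, the `append_missing` helper and the `needs_append` field are all hypothetical names used for illustration.

```python
def append_missing(record, reference, fields):
    """Fill empty fields from reference data and flag what is still missing.

    `reference` is a stand-in for third party data about the same record.
    Fields that cannot be filled are listed under "needs_append".
    """
    enriched = dict(record)  # work on a copy, leave the input untouched
    still_missing = []
    for field in fields:
        if enriched.get(field):
            continue  # already populated, nothing to append
        value = reference.get(field)
        if value:
            enriched[field] = value
        else:
            still_missing.append(field)
    enriched["needs_append"] = still_missing
    return enriched


# Hypothetical record with two gaps, only one of which can be filled.
record = {"email": "ann@example.com", "industry": "", "title": None}
reference = {"industry": "Retail"}
result = append_missing(record, reference, ["industry", "title"])
```

Keeping a list of unfilled fields on the record makes it easy to pick up the remaining gaps in a later appending pass.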

Data Standardization or Data Normalization – This refers to formatting data consistently, either by grouping text values into defined categories or by converting values so that they share one format. For example, you may have both “New York” and “NY” in the State field, which makes it hard to pull all New York records for segmentation or analysis, so you would standardize on one form. Date formats cause similar problems: one source may use MM/DD/YYYY while another uses DD-MM-YYYY, making it hard to run trend analysis.
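
Both examples can be handled with a lookup table and a date parser. The Python sketch below assumes you know which format each source uses, since a value like 03/04/2021 is ambiguous on its own; the state mapping is a tiny hypothetical sample and ISO 8601 (YYYY-MM-DD) is just one reasonable target format.

```python
from datetime import datetime

# Hypothetical, partial mapping of state spellings to one standard code.
STATE_CODES = {
    "new york": "NY",
    "ny": "NY",
    "california": "CA",
    "ca": "CA",
}


def standardize_state(value):
    """Map known spellings to a two-letter code; leave unknown values trimmed."""
    cleaned = value.strip()
    return STATE_CODES.get(cleaned.lower(), cleaned)


def standardize_date(value, source_format):
    """Parse a date in the source's known format and return it as YYYY-MM-DD."""
    return datetime.strptime(value.strip(), source_format).date().isoformat()


# The same day as written by two hypothetical sources.
from_source_a = standardize_date("03/04/2021", "%m/%d/%Y")  # MM/DD/YYYY
from_source_b = standardize_date("04-03-2021", "%d-%m-%Y")  # DD-MM-YYYY
state = standardize_state(" New York ")
```

Once both sources are converted to the same representation, records for the same state or the same day can be grouped and compared directly.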

While the above are the most common data cleaning initiatives, there may be other tasks that are required to increase data quality. If you are looking to get your data cleansed, contact StrategicDB for a free data audit, which will provide you with a full data cleaning strategy tailored to your business.