Data cleansing, also called data scrubbing, is the process of fixing incorrect data, removing duplicate records, correcting incomplete data, and so on, so that the organization has access to clean, accurate, and usable data. This is a major step in data analysis, because erroneous data can have catastrophic effects in the long run. A simple example is when salespeople enter a proper noun in two different ways, such as two sales representatives spelling the same person's name differently, one with a ‘y’ and the other with an ‘i’. This creates duplicate records, so the same sale is counted twice, which inflates the sales figures and misstates revenue in the system.
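The misspelled-name case can be illustrated with a short Python sketch. It is only one possible approach, not a prescribed method: it uses the standard library's `difflib.SequenceMatcher` to flag customer names that are nearly identical. The sample records, the field names (`id`, `customer`, `amount`), the helper names, and the 0.9 similarity threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; names and amounts are illustrative only.
records = [
    {"id": 1, "customer": "Kathryn Smyth", "amount": 1200},
    {"id": 2, "customer": "Kathrin Smyth", "amount": 1200},
    {"id": 3, "customer": "John Doe", "amount": 800},
]


def normalize(name):
    """Lowercase the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(rows, threshold=0.9):
    """Return pairs of record ids whose customer names are nearly identical.

    The threshold is an assumed value; tune it against real data.
    """
    pairs = []
    for left, right in combinations(rows, 2):
        if similarity(left["customer"], right["customer"]) >= threshold:
            pairs.append((left["id"], right["id"]))
    return pairs


if __name__ == "__main__":
    for first_id, second_id in likely_duplicates(records):
        print(f"Records {first_id} and {second_id} may be duplicates")
```

A similarity score only suggests that two records might refer to the same person, so flagged pairs are best reviewed before they are merged or deleted; two genuinely different customers can have very similar names.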

