Data cleaning is the process of finding and correcting inaccurate data within a given dataset (such as records, projects, databases, or spreadsheets). Data can be cleaned in a number of ways, either by scripting or by using dedicated tools such as OpenRefine.
While this guide will teach you how to navigate and use different elements of OpenRefine, many of the techniques mentioned (e.g. transforming data, dealing with duplicates, adding or removing columns) are important to data cleaning as a practice. The skills and techniques discussed throughout this tutorial are therefore transferable and can be used in other software and tools.
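To illustrate how these techniques carry over to scripting, here is a minimal Python sketch of the same three operations: transforming values, removing duplicates, and adding a column. The sample records, the field names, and the `clean` function are hypothetical and exist only for illustration; this is not how OpenRefine works internally.

```python
# Hypothetical sample records; the field names are illustrative only.
records = [
    {"name": "  Ada Lovelace ", "city": "london"},
    {"name": "Ada Lovelace", "city": "London"},
    {"name": "Grace Hopper", "city": "NEW YORK"},
]


def clean(rows):
    """Transform values, drop duplicate rows, and add a derived column."""
    seen = set()
    cleaned = []
    for row in rows:
        # Transform: trim surrounding whitespace and normalize capitalization.
        name = row["name"].strip()
        city = row["city"].strip().title()
        # Deduplicate: skip rows whose cleaned values have already been seen.
        key = (name, city)
        if key in seen:
            continue
        seen.add(key)
        # Add a column: derive a new field from an existing one.
        cleaned.append({"name": name, "city": city, "name_length": len(name)})
    return cleaned


cleaned_records = clean(records)
for row in cleaned_records:
    print(row)
```

Note that the first two records only turn out to be duplicates after the transformation step, which is why the sketch normalizes values before comparing them.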
Why clean your data?
Cleaning your data ultimately improves your data's quality and enables more accurate analysis.