
Getting Started with Data Cleaning and OpenRefine

This guide is meant to introduce readers to the importance of data cleaning through a useful tool for working with "messy" data, OpenRefine.

Starting a Project

Creating a Project

When it launches, OpenRefine gives users the option to create, open, or import a project. To create a new project, select that option from the left-hand side of the screen. There are several ways to import new data:

  1. Uploading a file from the computer
  2. Downloading data from a web address (URL)
  3. Pasting raw data into a plain text box
  4. Connecting to a database and importing the information, or
  5. Importing a public Google Spreadsheet through its URL

For the purposes of this guide, we will demonstrate how to upload new data from a file stored locally on your computer. The file used in this guide is a CSV (comma-separated values) file; however, OpenRefine supports many different file types.

Choose a file from your computer, then select the Next button.


Previewing Data

Once your data is uploaded, you will be taken to a Preview screen. On this screen, you can choose how OpenRefine parses your data. Because we are working with a CSV file, the "CSV / TSV / separator-based files" option will be in bold.
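To make "separator-based" parsing concrete, here is a minimal sketch in Python using the standard csv module. It is only an illustration of the general idea, not OpenRefine's own code; the function name parse_separated and the sample data are made up for this example.

```python
import csv
import io


def parse_separated(text, delimiter=","):
    """Split separator-based text into a list of rows, each a list of cells."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# The same two-column table, once comma-separated and once tab-separated.
# In the CSV version the name is quoted because it contains a comma.
sample_csv = 'name,city\n"Smith, Jo",Las Vegas\n'
sample_tsv = "name\tcity\nSmith, Jo\tLas Vegas\n"

csv_rows = parse_separated(sample_csv, ",")
tsv_rows = parse_separated(sample_tsv, "\t")

for row in csv_rows:
    print(row)
```

Both versions parse to the same rows once the correct separator is chosen. Picking the wrong separator would leave each line as a single unsplit cell, which is the kind of problem the Preview screen lets you spot before creating the project.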

Another box lets you specify the character encoding of your file. This is especially useful when working with text in languages that use accented or other non-English characters, which can display incorrectly if the wrong encoding is chosen.
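The short Python sketch below shows why the encoding setting matters. It is an illustration only and does not reflect how OpenRefine works internally; the sample text is invented. The same bytes are decoded twice, once with the encoding they were written in (UTF-8) and once with a different one (Latin-1).

```python
# Text containing accented characters, stored as UTF-8 bytes
# (as it might be inside a file on disk).
raw = "café in São Paulo".encode("utf-8")

# Decoding with the matching encoding recovers the original text.
right = raw.decode("utf-8")

# Decoding with a mismatched encoding does not raise an error here,
# but each accented letter turns into two unrelated characters.
wrong = raw.decode("latin-1")

print(right)
print(wrong)
```

If the preview of your data shows stray symbols where accented letters should be, trying a different character encoding is a reasonable first step.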

The boxes at the bottom right of the screen allow you to further customize how OpenRefine reads in your data. This is where you can tell the program that your columns have no headers (titles), or that there are lines at the top of the file it should skip.
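As a rough sketch of what the "skip lines" and "no headers" choices mean, the Python function below applies the same two ideas to a small piece of text. The function read_table, its parameters, and the placeholder column names are assumptions made for this example, not OpenRefine settings or code.

```python
import csv
import io


def read_table(text, skip_lines=0, has_header=True):
    """Return (header, data_rows) after skipping leading lines.

    skip_lines: number of lines at the top to ignore (e.g. an export note).
    has_header: if False, placeholder column names are generated instead.
    """
    rows = list(csv.reader(io.StringIO(text)))[skip_lines:]
    if not rows:
        return [], []
    if has_header:
        return rows[0], rows[1:]
    # No header row: invent names so every column can still be referred to.
    header = [f"Column {i + 1}" for i in range(len(rows[0]))]
    return header, rows


# A file with one note line before the real header row.
with_note = "Exported from catalog\nname,city\nAda,Reno\n"
header, data = read_table(with_note, skip_lines=1)

# A file that contains only data, with no header row at all.
no_header = "Ada,Reno\nBo,Elko\n"
header2, data2 = read_table(no_header, has_header=False)

print(header, data)
print(header2, data2)
```

Without skipping the note line, the first example would treat "Exported from catalog" as the only column title, which is why it is worth checking the preview before creating the project.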

Before selecting Create Project at the upper right-hand side of the screen, make sure you name your project, so you'll be able to find it later. Once the project is named and you like how your data is being displayed, you will create your project and be taken to a new screen (pictured below).

© University of Nevada Las Vegas