Working with columns and sorting

Overview

Teaching: 5 min
Exercises: 5 min
Questions
  • How do I move, rename or remove columns in OpenRefine?

  • How do I sort data in OpenRefine?

Objectives
  • Explain how to reorder, rename and remove columns

  • Explain how to sort data in columns

Reordering columns

You can re-order the columns by clicking the drop-down menu at the top of the first column (labelled ‘All’), and choosing ‘Edit columns’ > ‘Re-order / remove columns…’.

You can then drag and drop column names to re-order the columns, or remove columns completely if they are not required.
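To make the effect of re-ordering and removing columns concrete, here is a minimal sketch in plain Python. It is not OpenRefine's own code, only an analogy: the rows, the 'Notes' column and the `reorder_columns` helper are all made up for illustration.

```python
# Conceptual sketch only: plain Python, not how OpenRefine is implemented.
# The rows and the 'Notes' column are made-up illustration data.
rows = [
    {"Journal title": "Journal A", "Notes": "x", "Keywords": "biology"},
    {"Journal title": "Journal B", "Notes": "y", "Keywords": "physics"},
]


def reorder_columns(rows, new_order):
    """Return new rows holding only the columns in new_order, in that order.

    Any column left out of new_order is dropped, which plays the role of
    removing a column in the dialog.
    """
    return [{name: row[name] for name in new_order} for row in rows]


reordered = reorder_columns(rows, ["Keywords", "Journal title"])
print(list(reordered[0].keys()))  # ['Keywords', 'Journal title']
```

Note that the cell values are untouched: only the order and the set of columns change.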

Renaming columns

You can rename a column by opening the drop-down menu at the top of the column that you would like to rename, and choosing ‘Edit column’ > ‘Rename this column’. You will then be prompted to enter the new column name.
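As a rough analogy for what renaming does to the data, the sketch below renames a column in a list of rows while keeping its position. The helper name `rename_column` and the sample rows are hypothetical, and the duplicate-name check is a design choice of this sketch rather than a description of OpenRefine's behaviour.

```python
# Conceptual sketch only: plain Python with made-up sample data.
rows = [
    {"Journal title": "Journal A", "Keywords": "biology"},
    {"Journal title": "Journal B", "Keywords": "physics"},
]


def rename_column(rows, old_name, new_name):
    """Return new rows with old_name renamed to new_name, position unchanged."""
    # Refuse a name that is already in use, so no column is silently overwritten.
    if new_name != old_name and any(new_name in row for row in rows):
        raise ValueError(f"A column named {new_name!r} already exists")
    return [
        {(new_name if key == old_name else key): value for key, value in row.items()}
        for row in rows
    ]


renamed = rename_column(rows, "Journal title", "Title")
print(list(renamed[0].keys()))  # ['Title', 'Keywords']
```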

Sorting data

You can sort data in OpenRefine by clicking on the drop-down menu for the column you want to sort on, and choosing ‘Sort’.

Once you have sorted the data, a new ‘Sort’ drop-down menu will be displayed.

Unlike in Excel, sorts in OpenRefine are temporary: if you remove the sort, the data will go back to its original ‘unordered’ state. The ‘Sort’ drop-down menu lets you amend the existing sort (e.g., reverse the sort order), remove existing sorts, or make the sort permanent.
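The difference between a temporary and a permanent sort can be pictured with a small Python sketch. This is an analogy under stated assumptions, not OpenRefine's implementation: the titles are made-up data, and the idea of a separate 'view order' is simply one way to model a sort that can be removed again.

```python
# Conceptual sketch only: made-up titles, plain Python.
titles = ["Zoology Letters", "Acta Botanica", "Marine Notes"]  # stored row order

# Temporary sort: work out a display order but leave the stored rows alone.
view_order = sorted(range(len(titles)), key=lambda i: titles[i])
sorted_view = [titles[i] for i in view_order]

# Amending the sort (here: reversing it) only changes the view.
reversed_view = sorted(titles, reverse=True)

# Removing the sort means discarding the view; the stored order is unchanged.
print(titles)       # ['Zoology Letters', 'Acta Botanica', 'Marine Notes']
print(sorted_view)  # ['Acta Botanica', 'Marine Notes', 'Zoology Letters']

# Making the sort permanent: the displayed order replaces the stored order.
titles = list(sorted_view)
print(titles)       # ['Acta Botanica', 'Marine Notes', 'Zoology Letters']
```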

You can sort on multiple columns at the same time by adding a sort on another column in the same way.
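A multi-column sort orders rows by the first sorted column and uses the next one to break ties. The following Python sketch shows the idea with made-up rows; it illustrates the concept and is not taken from OpenRefine.

```python
# Conceptual sketch only: made-up rows, plain Python.
rows = [
    {"Subjects": "Science", "Journal title": "B"},
    {"Subjects": "Arts", "Journal title": "C"},
    {"Subjects": "Science", "Journal title": "A"},
]

# Sort by 'Subjects' first; rows with the same subject are then
# ordered by 'Journal title'.
by_subject_then_title = sorted(
    rows, key=lambda row: (row["Subjects"], row["Journal title"])
)

for row in by_subject_then_title:
    print(row["Subjects"], row["Journal title"])
# Arts C
# Science A
# Science B
```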

Organizing for more cleanup

Let’s get rid of a column we don’t need (‘Archiving information URL’), push the remaining two URL columns to the right, and re-arrange some other columns. This will help us work a little faster tomorrow.

Aim for this order:

  • Journal title
  • Alternative title
  • Keywords
  • Subjects
  • Added on Date
  • Most Recent Article Added

Solution

All the functions you need are in the ‘Edit column’ menu.

Figure: The Edit columns menu
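If it helps to check the result, the target layout can be worked out as a small Python sketch. The starting column order below is hypothetical, as are the names 'Journal URL', 'Licence URL' and 'Publisher'; only the six target names and 'Archiving information URL' come from the exercise. Detecting URL columns by looking for 'URL' in the name is also an assumption of this sketch.

```python
# Conceptual sketch only. The starting order and the names 'Journal URL',
# 'Licence URL' and 'Publisher' are hypothetical placeholders.
columns = [
    "Journal title", "Journal URL", "Alternative title",
    "Archiving information URL", "Licence URL", "Keywords",
    "Added on Date", "Subjects", "Most Recent Article Added", "Publisher",
]

target_front = [
    "Journal title", "Alternative title", "Keywords",
    "Subjects", "Added on Date", "Most Recent Article Added",
]
unwanted = {"Archiving information URL"}

# 1. Remove the column we do not need.
kept = [name for name in columns if name not in unwanted]
# 2. Put the target columns first, in the requested order.
front = [name for name in target_front if name in kept]
# 3. Push the remaining URL columns to the far right.
urls = [name for name in kept if "URL" in name and name not in front]
# 4. Everything else stays in between, in its existing order.
middle = [name for name in kept if name not in front and name not in urls]

new_order = front + middle + urls
print(new_order)
```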

Key Points

  • You can reorder, rename and remove columns in OpenRefine

  • Sorting in OpenRefine always sorts all rows

  • The original order of rows in OpenRefine is maintained during a sort until you use the option to Reorder Rows Permanently