Digital Lab

Recap: OpenRefine Workshop

In the DHNetwork’s first workshop, Tarje Sælen Lavik introduced the ins and outs of OpenRefine.

Photo of Tarje talking to a group of participants sitting around a table with their laptops. The PowerPoint slide in the background shows a series of portraits in a database.
Linn Heidi Stokkedal

During this lunch event, a small group of people took the time to learn how and why to use OpenRefine to clean and enhance their datasets. The workshop was led by Tarje Sælen Lavik, a principal librarian at UiB who uses and develops platforms for employees and students.

OpenRefine is a versatile tool that can be useful for people from different disciplines using different research methods. This reflects one of the goals of the DHNetwork: we want to connect researchers at UiB who use digital methods in their research, even if they do not consider their work to be strictly ‘Digital Humanities’.

As an example dataset for the workshop, we used Miriam Posner’s data on NJShipwrecks. Through a series of small operations, we learned how to clean the data using facets, filters, and clusters. OpenRefine is more than a data-cleaning tool, though: Tarje also taught us how to reconcile a dataset with the Wikidata database. By doing this, you can enhance your own data with the publicly accessible data gathered by Wikimedia.
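
For readers curious about what clustering does behind the scenes, here is a minimal Python sketch of the idea. It is loosely modeled on the "fingerprint" key-collision method described in OpenRefine's documentation, and it is an approximation rather than OpenRefine's actual code. The function names and the ship names are invented for illustration and do not come from the workshop dataset.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Approximate a key-collision fingerprint for one cell value."""
    # Trim, lowercase, and fold accented characters to ASCII where possible
    # (characters without an ASCII equivalent are simply dropped).
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop punctuation, then sort the unique whitespace-separated tokens.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


# Hypothetical ship names, invented for illustration.
names = ["Mary Ann", "mary ann", "Ann, Mary", "Sea Bird", "Sea  Bird.", "Falcon"]
clusters = cluster(names)

for key, variants in clusters.items():
    print(key, "->", variants)
```

Values that differ only in capitalization, punctuation, spacing, or word order end up under the same key, which is why a tool can offer to merge them into a single spelling. In OpenRefine you review each suggested cluster before merging, and that manual check matters because two genuinely different names can occasionally share a key.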

Resources and Next Steps:

If you are interested in OpenRefine but could not make it to the workshop, we highly recommend Tarje’s GitLab page for this workshop, which has instructions on how to use OpenRefine and examples from Posner’s dataset. If you have a GitLab account, you can also contribute to the workshop page by adding your own projects, datasets, and uses of OpenRefine to the tutorial. We are very interested in seeing what you are working on!

The workshop was a great start to the series of tutorials we plan to hold as part of the DHNetwork, covering platforms that are open enough to be interesting for people across disciplines and in various types of research projects. On 23 April 2019, we will have our next workshop, in which we will teach participants how to use the popular network analysis tool Gephi.