DATA MANAGEMENT

Data Management Plans

A data management plan (DMP) is a formal document that outlines how data are to be collected and handled, both during a research project and after the project is completed.

The main purpose of a data management plan is to ensure that research data are handled properly and safely throughout the project, and to prepare them for future preservation. A data management plan is also useful for the researcher because it:

makes it possible to identify significant problems at an early stage (such as obtaining consent or taking copyright into consideration)
identifies ahead of time any additional costs or resources needed to manage the data (such as additional storage capacity)
helps to plan data management needs ahead of time and to monitor data activities throughout the lifetime of the project
helps others use your data if they are shared

WHY do you have to write a Data Management Plan?

Funders such as the Research Council of Norway and Horizon Europe require that a data management plan be submitted within 6 months from receipt of support. In addition, the UiB policy for Open Science states that "All research projects led by researchers at UiB will have a data management plan".

A data management plan (DMP) is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyze, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.

As a researcher you should start writing a data management plan in the early phase of the project. The data management plan is a living document which should be revised and updated as research progresses.

WHAT should a Data Management Plan include?

The Research Council of Norway describes the data management plan as a living document that follows the research project. It specifies the kind of data that will be generated, how the data will be described, where the data will be stored, and whether and how the data can be shared.

The content of a data management plan will vary between fields of study. Science Europe's Practical Guide to the International Alignment of Research Data Management describes the following core requirements:

1. Data description and collection or re-use of existing data

- How will new data be collected or produced and/or how will existing data be re-used?
- What data (for example the kinds, formats, and volumes) will be collected or produced?

2. Documentation and data quality

- What metadata and documentation (for example the methodology of data collection and way of organising data) will accompany data?
- What data quality control measures will be used?

3. Storage and backup during the research process

- How will data and metadata be stored and backed up during the research process?
- How will data security and protection of sensitive data be taken care of during the research?

4. Legal and ethical requirements, codes of conduct

- If personal data are processed, how will compliance with legislation on personal data and on data security be ensured?
- How will other legal issues, such as intellectual property rights and ownership, be managed?
- What legislation is applicable?
- How will possible ethical issues be taken into account, and codes of conduct followed?

5. Data sharing and long-term preservation

- How and when will data be shared?
- Are there possible restrictions to data sharing or embargo reasons?
- How will data for preservation be selected, and where will data be preserved long-term (for example a data repository or archive)?
- What methods or software tools will be needed to access and use the data?
- How will the application of a unique and persistent identifier (such as a Digital Object Identifier (DOI)) to each data set be ensured?

6. Data management responsibilities and resources

- Who (for example role, position, and institution) will be responsible for data management (i.e. the data steward)?
- What resources (for example financial and time) will be dedicated to data management and ensuring that data will be FAIR (Findable, Accessible, Interoperable, Re-usable)?

See the DMP checklist and the CESSDA Data Management Expert Guide for more information.
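As a rough illustration of the six core requirements listed above, the sketch below checks which sections of a draft plan are still empty. The dictionary layout for the draft and the function name `missing_sections` are assumptions made for this example; they do not correspond to the export format of any DMP tool.

```python
# Section titles follow the six Science Europe core requirements listed above.
CORE_SECTIONS = [
    "Data description and collection or re-use of existing data",
    "Documentation and data quality",
    "Storage and backup during the research process",
    "Legal and ethical requirements, codes of conduct",
    "Data sharing and long-term preservation",
    "Data management responsibilities and resources",
]


def missing_sections(draft: dict[str, str]) -> list[str]:
    """Return the core sections that are absent or still empty in a draft.

    `draft` maps a section title to the text written so far. This layout is
    a hypothetical structure chosen for the sketch.
    """
    return [title for title in CORE_SECTIONS if not draft.get(title, "").strip()]


# Example draft: one section is written, one contains only whitespace.
example_draft = {
    CORE_SECTIONS[0]: "Survey responses collected as CSV files.",
    CORE_SECTIONS[2]: "   ",
}

for title in missing_sections(example_draft):
    print("Still to write:", title)
```

A check like this only confirms that each section has some text. Whether the answers are adequate still has to be judged against the funder's requirements.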

HOW do you write a Data Management Plan?

Several DMP tools can help you write a data management plan. Most of them follow the Science Europe guidelines and comply with the requirements of the Research Council of Norway (NFR) and the European Commission. The plans can easily be exported and shared.

Recommended DMP tools are:

DMPonline (note: the Science Europe template is found under the tab Reference > Funder Requirements)
Data Stewardship Wizard from ELIXIR (recommended for Life Sciences; own template adapted to Norwegian requirements, can export machine-actionable DMP)

Further options:

EasyDMP from Sigma2/Sikt (adapted to Norwegian requirements, can export machine-actionable DMP)
DMP from Sikt
Argos from OpenAIRE

When creating your DMP with one of these tools, we recommend reading the CESSDA Data Management Expert Guide, ELIXIR's RDMkit, the Science Europe guidelines, and DCC's checklist.

The Research Council strongly encourages you to share your DMP in an open repository. You can do this, for example, by exporting your plan from the DMP tool you have used and publishing it on Zenodo.

Examples of DMPs

Example DMPs and guidance from Digital Curation Centre
Curated collection of Horizon 2020 DMPs from the University of Vienna
DMP Catalogue from LIBER Europe

In addition, there are many public DMPs in Zenodo, DMPonline or DMPtool, and we encourage you to look for DMPs within your field. However, be aware that these plans have not been curated or reviewed, and should not be treated as models to follow.

FAQ - Frequently asked questions

What are research data?

In general, research data can be defined as all data which are created by researchers in the course of their work.

More specifically, research data are information, in particular facts or numbers, collected or created to develop claims made in the academic literature, e.g. statistics, results of experiments, measurements, observations resulting from fieldwork, survey results, interview recordings, images etc.

What are the FAIR principles?

The FAIR principles were published in 2016 (Wilkinson et al.) as a detailed set of guidelines to ensure that archived research data are of sufficient quality. The FAIR principles in brief:

Findable: Finding datasets is the first step of (re)using datasets. This requires metadata ("data about data"), readable for humans and machines, and persistent identifiers.
Accessible: Criteria to ensure access to a found dataset. Metadata must be retrievable by a standardized protocol that allows for authorization procedures where necessary. Importantly, metadata must remain accessible even if the data themselves are no longer available.
Interoperable: Prerequisites to integrate datasets with other data and applications. Metadata should use ontologies/controlled vocabularies and include qualified references to other (meta)data.
Reusable: FAIR should allow for reuse of data. Metadata must be rich, must contain provenance, and follow community standards. A data usage license ensures legal interoperability.

Do I need to share all my research data from my research project?

No. You need to go through a selection process to decide which data you need to keep, which data you wish to keep, and which data should not be kept or are not worth keeping. Also check the retention periods set by your funder, the university, and any legal or regulatory requirements.

Where no specific guidance is available, we recommend researchers keep in mind two things when deciding which data to share:
- What data are necessary to reproduce or validate your results? Note that this may include code.
- What data have the potential for reuse by others?

The Digital Curation Centre (DCC) offers useful guidance in 'Five steps to decide what data to keep'.

Are there any "Best Practices" for managing research data?

The following are some of the components necessary for good data management practices.

- Use descriptive and informative file names
- Choose file formats that will ensure long-term access
- Track different versions of your documents
- Create metadata for every experiment or analysis you run

See the Cessda Data Management Expert Guide or DataONE Best Practices Primer for more useful tips!
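To make the first and third practices above more concrete, here is a minimal sketch of a helper that builds descriptive, versioned file names. The pattern (date, project, description, version number) and the function name `build_file_name` are assumptions chosen for illustration; use whatever convention your project or discipline has agreed on.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lower-case the text and turn every run of other characters into a hyphen.

    Only the ASCII letters a-z and the digits 0-9 are kept in this sketch.
    """
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_file_name(project: str, description: str, version: int,
                    extension: str, day: date) -> str:
    """Compose a name of the form DATE_project_description_vNN.ext."""
    ext = extension.lstrip(".").lower()
    return f"{day.isoformat()}_{slug(project)}_{slug(description)}_v{version:02d}.{ext}"


# Hypothetical project and file, used only to show the resulting pattern.
print(build_file_name("My Project", "Interview notes", 2, ".TXT", date(2024, 2, 19)))
# prints: 2024-02-19_my-project_interview-notes_v02.txt
```

Putting the date first in ISO format means that files sort chronologically in a folder listing, and the zero-padded version number keeps v02 ahead of v10.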

What is metadata?

Metadata is structured information that describes, explains, locates, and makes it easier to retrieve and use an information resource.

In order to help make your data reusable and accessible to you and others in the future, you need to create and archive accurate metadata along with your data. The Digital Curation Centre (DCC) provides an overview of discipline-specific metadata.
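One simple way to keep metadata together with the data is a small "sidecar" file stored next to each data file. The sketch below writes such a file as JSON. The field names (`title`, `creator`, `description`, `date_created`, `license`) and all values are placeholders chosen for this example and do not follow a specific metadata standard; check which standard applies in your discipline.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sidecar(data_file: Path, metadata: dict) -> Path:
    """Save `metadata` as JSON next to `data_file` and return the new path."""
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar


# Example run in a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "survey_results.csv"
    data_file.write_text("respondent,answer\n1,yes\n", encoding="utf-8")

    # Field names and values are placeholders for illustration only.
    sidecar = write_metadata_sidecar(data_file, {
        "title": "Example survey results",
        "creator": "Example Researcher",
        "description": "One row per respondent; answers are yes or no.",
        "date_created": "2024-01-15",
        "license": "CC BY 4.0",
    })
    print(sidecar.name)
    loaded = json.loads(sidecar.read_text(encoding="utf-8"))
```

Because the sidecar is plain text, it remains readable without special software, and repositories that ask for similar fields at deposit time can be filled in from it.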

What formats are best for preserving files in the long term?

Commonly used formats such as those produced by Microsoft Office products (e.g. Word documents or Excel spreadsheets) are very likely to have reasonable longevity, but be aware that they are proprietary (owned by someone) and so will not necessarily exist forever or remain easily readable.

Instead, you should use open, non-proprietary formats: for example, .txt rather than Microsoft Word, CSV rather than Excel, TIFF rather than Photoshop files, or XML rather than a database.

However, open formats may not support all the functionality found within a proprietary format, or they might result in larger files because they offer less efficient compression of files. Sometimes, you will want to store your data in its original format and also in a more open or accessible format for sharing, archiving, or future use. The Dutch national centre of expertise and repository for research data Data Archiving and Networked Services (DANS) provides more examples of preferred file formats.
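The substitutions mentioned above can be turned into a quick check of a file list before archiving. The following sketch flags files in proprietary formats and suggests an open alternative. The mapping only covers the Word, Excel and Photoshop examples from the text, and the function name `suggest_open_formats` is an assumption made for this illustration. The sketch does not convert any files.

```python
from pathlib import Path

# Proprietary extension -> open alternative, based on the examples in the text.
OPEN_ALTERNATIVES = {
    ".doc": ".txt",
    ".docx": ".txt",
    ".xls": ".csv",
    ".xlsx": ".csv",
    ".psd": ".tiff",
}


def suggest_open_formats(file_names: list[str]) -> dict[str, str]:
    """Map each file in a proprietary format to a suggested open extension.

    Files whose extension is not in OPEN_ALTERNATIVES are left out of the result.
    """
    suggestions = {}
    for name in file_names:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        if suffix in OPEN_ALTERNATIVES:
            suggestions[name] = OPEN_ALTERNATIVES[suffix]
    return suggestions


# Hypothetical file list for illustration.
files = ["interview_notes.DOCX", "measurements.csv", "figure1.psd"]
for name, target in suggest_open_formats(files).items():
    print(f"{name}: consider also saving as {target}")
```

As noted above, an open copy may lose some functionality of the original, so it is often sensible to keep both versions.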

What are 'non-proprietary' or 'open' formats, and why should I use them?

A non-proprietary format is a format which does not have restrictions on its use and over which no one claims intellectual property rights. For example, Microsoft Office products, such as Microsoft Word, are proprietary, while OpenOffice products are non-proprietary (and open source).

For long-term access to files, digital preservation experts tend to recommend 'non-proprietary' and 'open' formats. The logic here is that if the code behind the software is publicly available (i.e. open source), then that format/software will be supported so long as at least one competent tinkerer still finds it interesting or useful. In contrast, a private software company can go out of business or stop producing a compatible version of the software in whose format your data were saved, and no one will have the rights or knowledge to provide it anymore.

Where can I archive and share my research data?

To make your research data visible and findable, you should choose a subject-specific repository. The webpage re3data.org is the largest and most comprehensive registry of data repositories available on the web. Another curated registry is fairsharing.org.

If there is no subject-specific repository, researchers at UiB can archive data in DataverseNO.

Please see also our page on Open Access to Research Data.

How do I manage personal or sensitive data?

Personal and sensitive research data need special caution. You should never publish data that identify individual people, and you must treat such data with extra care when collecting, transferring or storing them for research purposes.

Unlike other research data, personal or sensitive data cannot always be shared in an open archive. The guiding principle is "as open as possible, as restricted as necessary". However, if the data are properly de-identified or anonymized and suitable consent forms are in place, you may share a processed version of such data, either in an open repository or in a closed repository.

Please see also the information on our page on Open Access to Research Data.

Contact & guidance

We offer courses on request. Please get in touch if you would like a course specifically for your institute or research group!

Norwegian Open Science event calendar, with interesting events at other institutions.

Presentations

Presentation slides from past courses are available on Zenodo.

Useful resources

Science Europe: Practical Guide to the International Alignment of Research Data Management

FAIR-Aware Additional guidance to the Science Europe DMP assessment rubric

DCC checklist for Data Management Plan

Cessda Data Management expert guide - Plan

Elixir RDMkit - Data Management Plan

Related content

The what, why and how of data management planning

Open Access to Research Data

The University Library offers guidance on various aspects of research data handling and data management planning.

The University of Bergen Policy for Open Science

Openness, transparency, and knowledge exchange are core values for the University of Bergen. Technological changes and increased digitalization have created new opportunities for research, education, innovation, and artistic research.

19.02.2024