Data Management Plans
A data management plan (DMP) is a formal document that outlines how data are to be collected and handled, both during a research project and after the project is completed.
WHY do you have to write a Data Management Plan?
Funders such as the Research Council of Norway and the European Union require that a data management plan be submitted within 6 months from receipt of support. In addition, the UiB policy for Open Science states that "All research projects led by researchers at UiB will have a data management plan".
A data management plan (DMP) is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyze, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.
The main purpose of a data management plan is to ensure that research data are handled properly and safely throughout the project, and to prepare them for future preservation. The data management plan is also useful for the researcher as it:
- makes it possible to identify significant problems at an early stage (such as obtaining consent or taking copyright into account)
- helps identify ahead of time any additional costs or resources needed to manage the data (such as additional storage capacity)
- helps plan data management needs ahead of time and monitor data activities throughout the lifetime of the project
- helps others use your data if they are shared
As a researcher you should start writing a data management plan in the early phase of the project. The data management plan is a living document which should be revised and updated as research progresses.
WHAT should a data management plan include?
The Research Council of Norway describes the data management plan as a living document that follows the research project. It specifies the kind of data that will be generated, how the data will be described, where the data will be stored, and whether and how the data can be shared.
The content of a data management plan will vary between fields of study. Science Europe's Practical Guide to the International Alignment of Research Data Management describes the following core requirements:
1. Data description and collection or re-use of existing data
- How will new data be collected or produced and/or how will existing data be re-used?
- What data (for example the kinds, formats, and volumes) will be collected or produced?
2. Documentation and data quality
- What metadata and documentation (for example the methodology of data collection and way of organising data) will accompany data?
- What data quality control measures will be used?
3. Storage and backup during the research process
- How will data and metadata be stored and backed up during the research process?
- How will data security and protection of sensitive data be taken care of during the research?
4. Legal and ethical requirements, codes of conduct
- If personal data are processed, how will compliance with legislation on personal data and on data security be ensured?
- How will other legal issues, such as intellectual property rights and ownership, be managed?
- What legislation is applicable?
- How will possible ethical issues be taken into account, and codes of conduct followed?
5. Data sharing and long-term preservation
- How and when will data be shared?
- Are there possible restrictions to data sharing or embargo reasons?
- How will data for preservation be selected, and where will data be preserved long-term (for example a data repository or archive)?
- What methods or software tools will be needed to access and use the data?
- How will the application of a unique and persistent identifier (such as a Digital Object Identifier (DOI)) to each data set be ensured?
6. Data management responsibilities and resources
- Who (for example role, position, and institution) will be responsible for data management (i.e. the data steward)?
- What resources (for example financial and time) will be dedicated to data management and ensuring that data will be FAIR (Findable, Accessible, Interoperable, Re-usable)?
HOW do you write a Data Management Plan?
There are several DMP tools that can help you write a data management plan. Most of these follow the Science Europe guidelines and comply with the requirements of the Research Council of Norway (NFR) and Horizon 2020. The plans can easily be exported and shared. Examples of DMP tools are:
- DMPonline (NB! when creating an account, choose "other" organisation)
- Argos OpenAIRE
- EasyDMP from Sigma2 (adapted to Norwegian requirements)
- Data Steward Wizard from Elixir (highly recommended for Life Sciences; own template adapted to Norwegian requirements)
- Norsk senter for forskningsdata (NSD)
The Research Council strongly encourages you to share your DMP in an open repository. You can do this by exporting a PDF of your plan from the DMP tool you have used and publishing it in Zenodo.
Examples of DMPs
There are many public DMPs in Zenodo, DMPonline or DMPtool, and we encourage you to look for DMPs within your field. However, be aware that these plans have not been curated or reviewed, and should not be treated as models to follow.
FAQ - Frequently asked questions
What are research data?
In general, research data can be defined as all data which are created by researchers in the course of their work.
More specifically, they are information, in particular facts or numbers, collected or created to develop claims made in the academic literature, e.g. statistics, results of experiments, measurements, observations resulting from fieldwork, survey results, interview recordings, and images.
What are the FAIR principles?
- Findable: Finding datasets is the first step of (re)using datasets. This requires metadata ("data about data"), readable for humans and machines, and persistent identifiers.
- Accessible: Criteria to ensure access to a found dataset. Metadata must be retrievable by a standardized protocol that allows for authorization procedures where necessary. Importantly, metadata must remain accessible even if the data themselves are no longer available.
- Interoperable: Prerequisites to integrate datasets with other data and applications. Metadata should use ontologies/controlled vocabularies and include qualified references to other (meta)data.
- Reusable: FAIR should allow for reuse of data. Metadata must be rich, must contain provenance, and follow community standards. A data usage license ensures legal interoperability.
Do I need to share all my research data from my research project?
No. You need to go through a selection process to decide which data you need to keep, which data you wish to keep, and which data should not be kept or are not worth keeping. Also check the retention periods set by your funder, the university, and any legal or regulatory requirements.
Where no specific guidance is available, we recommend researchers keep in mind two things when deciding which data to share:
- What data are necessary to reproduce or validate your results? Note that this may include code.
- What data have the potential for reuse by others?
The Digital Curation Centre (DCC) offers useful guidance in ‘Five steps to decide what data to keep’.
Are there any "Best Practices" for managing research data?
The following are some of the components necessary for good data management practices.
- Use descriptive and informative file names
- Choose file formats that will ensure long-term access
- Track different versions of your documents
- Create metadata for every experiment or analysis you run
See the CESSDA Data Management Expert Guide or the DataONE Best Practices Primer for more useful tips!
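As an illustration of the first and third points above, here is a minimal Python sketch of a file naming convention. The pattern (project, description, ISO date, two-digit version) and the function name `build_file_name` are assumptions made for this example, not a convention prescribed by UiB or any funder; adapt it to whatever your research group agrees on.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, when: date,
                    version: int, extension: str) -> str:
    """Build a descriptive, sortable file name (hypothetical convention)."""

    def slug(text: str) -> str:
        # Lower-case the text and collapse every run of characters that is
        # not a letter or digit into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".").lower()
    # ISO dates (YYYY-MM-DD) sort chronologically; zero-padded versions sort numerically.
    return f"{slug(project)}_{slug(description)}_{when.isoformat()}_v{version:02d}.{ext}"


# Placeholder values, for illustration only.
print(build_file_name("FjordSurvey", "Water temperature", date(2023, 5, 14), 2, ".CSV"))
# prints: fjordsurvey_water-temperature_2023-05-14_v02.csv
```

Keeping the version number in the name is only a lightweight substitute for a proper version control system, which is usually the better choice for code and text documents.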
What is metadata?
Metadata is structured information that describes, explains, locates, and makes it easier to retrieve and use an information resource.
In order to help make your data reusable and accessible to you and others in the future, you need to create and archive accurate metadata along with your data. Discipline-specific examples of metadata are provided by the Digital Curation Centre, and can be found here: http://www.dcc.ac.uk/resources/metadata-standards
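One simple way to keep metadata together with the data is a "sidecar" file stored next to each data file. The Python sketch below writes such a file as JSON. The field names are illustrative assumptions loosely inspired by common descriptive metadata elements; they do not follow any particular standard, so check the discipline-specific standards listed by the DCC before settling on a schema. The function name `write_metadata_sidecar` and all values are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write metadata as '<data file name>.metadata.json' next to the data file."""
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2, ensure_ascii=False),
                       encoding="utf-8")
    return sidecar


# Placeholder record: every value here is invented for illustration.
example_metadata = {
    "title": "Example water temperature measurements",
    "creator": "Example Researcher",
    "description": "Hourly readings from one hypothetical sensor.",
    "date_created": "2023-05-14",
    "license": "CC-BY-4.0",
    "keywords": ["example", "temperature"],
}

# Demonstrate in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "observations.csv"
    data_file.write_text("site,temp_c\nA,4.5\n", encoding="utf-8")
    sidecar = write_metadata_sidecar(data_file, example_metadata)
    print(sidecar.name)
```

A plain JSON file is readable by both humans and machines, which fits the FAIR requirement for metadata, but a repository will normally ask you to enter the same information in its own form when you deposit the data.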
Video (2 min): What are metadata (and why are they so important)?
What formats are best for preserving files in the long term?
Commonly used formats such as those produced by Microsoft Office products (e.g. Word documents or Excel spreadsheets) are very likely to have reasonable longevity, but be aware that they are proprietary (owned by someone) and so will not necessarily exist forever or remain easily readable.
Where possible, use open, non-proprietary formats instead: for example, .txt rather than Microsoft Word, CSV rather than Excel, TIFF rather than Photoshop files, or XML rather than a database. For more examples, see
However, open formats may not support all the functionality found within a proprietary format, or they might result in larger files because they offer less efficient compression of files. Sometimes, you will want to store your data in its original format and also in a more open or accessible format for sharing, archiving, or future use. For more information on preferred file formats, see here.
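As a small illustration of storing tabular data in an open format, the Python sketch below writes a table to a UTF-8 encoded CSV file using only the standard library. It assumes the table is already available in memory as a header plus a list of rows; reading an Excel workbook requires a third-party library and is not shown. The function name `save_table_as_csv` and the sample values are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def save_table_as_csv(path: Path, header: list, rows: list) -> None:
    """Write a header and rows to a UTF-8 CSV file."""
    # newline="" lets the csv module control line endings itself,
    # as recommended in the Python documentation.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Placeholder data, for illustration only.
header = ["site", "temp_c"]
rows = [["A", 4.5], ["B, inner fjord", 5.1]]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "observations.csv"
    save_table_as_csv(csv_path, header, rows)
    # Read the file back to confirm that it round-trips.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        print(list(csv.reader(handle)))
```

Note that CSV stores values only: formulas, formatting and multiple sheets are lost, and numbers come back as text when the file is read. This is one case where keeping the original file alongside the open copy can make sense.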
What are 'non-proprietary' or 'open' formats, and why should I use them?
A non-proprietary format is a format which does not have restrictions on its use and over which no one claims intellectual property rights. For example, Microsoft Office products, such as Microsoft Word, are proprietary, while Open Office products are non-proprietary (and open source).
For long-term access to files, digital preservation experts tend to recommend 'non-proprietary' and 'open' formats. The logic here is that if the code behind the software is publicly available (i.e. open source), then that format/software will be supported so long as at least one competent tinkerer still finds it interesting or useful. In contrast, a private software company can go out of business or stop producing a compatible version of the software in whose format your data was saved, and no one will have the rights or knowledge to provide it anymore.
Where can I archive and share my research data?
To make your research data visible and findable, you should choose a subject-specific repository. The webpage re3data.org is the largest and most comprehensive registry of data repositories available on the web. Another curated registry is fairsharing.org.
If there is no subject-specific repository, researchers at UiB can archive data in UiB Open Research Data. If you have sensitive data, you can archive your data at NSD.
Please see also our page on Open Access to Research Data.
How do I manage personal or sensitive data?
Personal and sensitive research data need special caution. You should never publish data that identify individual people, and you must treat such data with extra care when collecting, transferring or storing them for research purposes.
Projects that collect personal data must consider whether the project should notify NSD. Read more. The IT department at UiB has a service for secure storage of and access to sensitive data during the research project, SAFE (Information in Norwegian).
Unlike other research data, personal or sensitive data cannot always be shared in an open archive. The saying is "as open as possible, as restricted as necessary". However, if the data are properly de-identified or anonymized and covered by suitable consent forms, you may share a processed version of such data, either in an open repository or in a closed repository (such as NSD). Read more about sensitive data and anonymization here.
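To give a feel for what one de-identification step can look like, the Python sketch below removes direct identifiers and replaces a participant ID with a keyed hash (pseudonymization). This is an illustration under stated assumptions, not a complete or approved procedure: the field names, the function names and the sample record are all hypothetical, and pseudonymized data generally still count as personal data, because indirect identifiers (such as age combined with postcode) may remain and anyone holding the key can recompute the codes. Do not rely on this step alone before sharing; follow the guidance from NSD and your institution.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable code for an identifier, derived with HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Shorten the hex digest for readability (assumption: 16 characters suffice here).
    return digest.hexdigest()[:16]


def deidentify(records, secret_key, id_field="participant_id",
               drop_fields=("name", "email")):
    """Drop direct identifiers and replace the ID field with a pseudonym."""
    cleaned = []
    for record in records:
        row = {key: value for key, value in record.items()
               if key not in drop_fields}
        row[id_field] = pseudonymize(str(record[id_field]), secret_key)
        cleaned.append(row)
    return cleaned


# Invented sample record and key, for illustration only.
# A real key must be generated randomly and stored securely, apart from the data.
sample = [{"participant_id": "P001", "name": "Kari Nordmann",
           "email": "kari@example.org", "score": 7}]
key = b"replace-with-a-randomly-generated-secret"

print(deidentify(sample, key))
```

The same participant always receives the same code as long as the key is unchanged, so records from different files can still be linked within the project without exposing the original ID.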
Please see also the information on our page on Open Access to Research Data.