A data management plan (DMP) is a document that defines how data will be handled throughout the lifecycle of a project, that is, from its acquisition to its archival.

While these documents are typically used for research projects to meet funder requirements, they can be leveraged within a corporate environment as well to create structure and alignment between stakeholders.

Since DMPs highlight the types of data that will be used within the project and address how that data is managed throughout the data lifecycle, stakeholders, such as governance teams, can provide clear feedback on the storage and dissemination of sensitive data, such as personally identifiable information (PII), at the outset of a project. These documents allow teams to avoid compliance and regulatory pitfalls, and they can serve as templates on how to approach and manage data for future projects.

A data management plan typically has five components:

1. A statement of purpose
2. Data definitions
3. Data collection and access
4. Frequently asked questions (FAQs)
5. Research data limitations

Each of these focus areas enables research agencies and research funders (or perhaps your data management team) to assess the amount of risk associated with a given project. The data management plan also addresses how to manage that risk. For example, if sensitive data is used within a project, is it appropriate to re-use that data for future projects? Depending on the sensitivity of that data, it may not be appropriate, or it may require additional user consent.   

Each component of a data management plan focuses on a particular piece of information. We’ll delve into each one below.

1. Statement of purpose:  This explains why the team needs to acquire specific types of data over the course of the project. It should clearly outline the question that the team is attempting to answer with this dataset.

2. Data definitions:  Data descriptions help end users and their audiences understand naming conventions and their correspondence with specific datasets. Some of this information may also be held within the metadata, typically labeling data by its data sources and file formats. Creating and abiding by pre-defined metadata standards throughout the data acquisition process will also ensure a more consistent collection and smoother integration process.

3. Data collection and access:  This section of a DMP highlights how data will be collected, stored, and accessed from a data repository. It will likely address the source of any existing data or the approach that will be taken to create new data, such as an experiment. It should also contain information about the timing of data, i.e., how often it will be updated and over what period of time. The type and timing of the data will generally inform how it is stored and how third parties access it. For example, unstructured data is usually better suited to a non-relational system than a relational one, and larger datasets will require more compute power than smaller ones. There may also be restrictions on data sharing due to privacy or intellectual property rights. Since project stakeholders will expect that sensitive data, such as personally identifiable information (PII), is treated with the utmost care and security, it’s important for data owners to be clear about their data management practices, particularly in this area. This includes answering questions about the data’s long-term preservation, such as data archiving or data re-use. For data that is not sensitive in nature, there will be an expectation to provide a pathway for third parties to access raw data and research results.

4. Frequently Asked Questions:  This section can be considered a “catch-all” for other common questions within data management projects, such as sharing plans, citation preferences, and data backup methods. Researchers or data owners may want to highlight any digital object identifiers (DOIs) for owners of adjacent or related projects. Additionally, if project owners are archiving data, they’ll also need to address the length of the archive’s existence. Will it live for one year, five years, or perhaps indefinitely?

5. Research data limitations:  This section addresses upfront limitations with the dataset, which will limit its ability to generalize more broadly to populations. For example, data may be focused on a specific demographic, such as a geography, gender, race, age group, et cetera.
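
The five components above can be captured as a structured record so that a governance team can check a draft for missing sections. The following is a minimal sketch, not a standard schema: the class name `DataManagementPlan`, its field names, the `missing_sections` helper, and the sample text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical record mirroring the five DMP components listed above."""
    statement_of_purpose: str = ""
    data_definitions: str = ""
    data_collection_and_access: str = ""
    faqs: str = ""
    research_data_limitations: str = ""


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of sections that are still empty or whitespace-only."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Illustrative draft with only two of the five sections filled in.
draft = DataManagementPlan(
    statement_of_purpose="Assess whether onboarding changes reduce churn.",
    data_collection_and_access="Monthly exports; PII restricted to the governance team.",
)
print(missing_sections(draft))
# ['data_definitions', 'faqs', 'research_data_limitations']
```

A real plan would hold far more detail per section; the point of the sketch is only that an explicit structure makes gaps visible early.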

Data management plans are predominantly used in more academic settings, particularly for federal government funded programs, such as the National Institutes of Health (NIH) and National Science Foundation (NSF), but corporations can also leverage them in either their research or data governance functions. While academics and researchers need to comply with funder requirements in grant applications, many research institutions create a DMP tool to provide participants with the relevant template for their research project. Data governance teams within organizations can set up similar protocols to ingest data requests from stakeholders advocating for new data initiatives.

Grant applications

Researchers in both private and public sectors look to different funding agencies to sponsor research and innovation initiatives. DMPs mitigate risk for both parties by ensuring that data owners have assessed the value of the data as well as their own responsibilities (e.g., security and disaster recovery measures) for research data management.

Data governance initiatives

Data management plans are also incredibly helpful for new data initiatives in business settings, helping all stakeholders understand the importance of new data sources and how they can tie to business outcomes. As developments in hybrid cloud, artificial intelligence, the internet of things (IoT), and edge computing continue to spur the growth of big data, enterprises will need to find ways to manage its complexity within their data systems.

MIT Libraries

Data management

Write a data management plan

A data management plan (DMP) will help you manage your data, meet funder requirements, and help others use your data if shared.

Use the questions below, along with any specific data management requirements from your funding agency, to write your data management plan. Additional resources for creating plans are also provided below.

  • What’s the purpose of the research?
  • What is the data? How and in what format will the data be collected? Is it numerical data, image data, text sequences, or modeling data?
  • How much data will be generated for this research?
  • How long will the data be collected and how often will it change?
  • Are you using data that someone else produced? If so, where is it from?
  • Who is responsible for managing the data? Who will ensure that the data management plan is carried out?
  • What documentation will you be creating in order to make the data understandable by other researchers?
  • Are you using metadata that is standard to your field? How will the metadata be managed and stored?
  • What file formats will be used? Do these formats conform to an open standard and/or are they proprietary?
  • Are you using a file format that is standard to your field? If not, how will you document the alternative you are using?
  • What directory and file naming convention will be used?
  • What are your local storage and backup procedures? Will this data require secure storage?
  • What tools or software are required to read or view the data?
  • Who has the right to manage this data? Is it the responsibility of the PI, student, lab, MIT, or funding agency?
  • What data will be shared, when, and how?
  • Does sharing the data raise privacy, ethical, or confidentiality concerns? Do you have a plan to protect or anonymize data, if needed?
  • Who holds intellectual property rights for the data and other information created by the project? Will any copyrighted or licensed material be used? Do you have permission to use/disseminate this material?
  • Are there any patent- or technology-licensing-related restrictions on data sharing associated with this grant? The Technology Licensing Office (TLO) can provide this information.
  • Will this research be published in a journal that requires the underlying data to accompany articles?
  • Will there be any embargoes on the data?
  • Will you permit re-use, redistribution, or the creation of new tools, services, data sets, or products (derivatives)? Will commercial use be allowed?
  • How will you be archiving the data? Will you be storing it in an archive or repository for long-term access? If not, how will you preserve access to the data?
  • Is a discipline-specific repository available? If not, consider depositing your data into a generalist data repository. Email us at [email protected] if you’re interested in discussing repository options for your data.
  • How will you prepare data for preservation or data sharing? Will the data need to be anonymized or converted to more stable file formats?
  • Are software or tools needed to use the data? Will these be archived?
  • How long should the data be retained? 3-5 years, 10 years, or forever?
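
Some of the questions above, such as the file naming convention and the retention period, have answers that can be written down precisely enough to check automatically. The sketch below assumes a hypothetical naming convention of `<project>_<YYYYMMDD>_<description>_v<NN>.<ext>` and a hypothetical five-year retention period; neither comes from the guidance above, and the function names are illustrative.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<date>\d{8})_(?P<desc>[a-z0-9-]+)"
    r"_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def follows_convention(filename: str) -> bool:
    """Check a file name against the assumed naming convention."""
    return NAME_PATTERN.match(filename) is not None


def retention_end(collected: date, years: int) -> date:
    """Return the date on which an assumed retention period of `years` ends."""
    try:
        return collected.replace(year=collected.year + years)
    except ValueError:
        # Feb 29 with a non-leap target year: fall back to Feb 28.
        return collected.replace(year=collected.year + years, day=28)


print(follows_convention("soilsurvey_20240523_ph-readings_v01.csv"))  # True
print(follows_convention("final data (2).xlsx"))                      # False
print(retention_end(date(2024, 5, 23), 5))                            # 2029-05-23
```

Whatever convention a project adopts, stating it in the plan in a form this explicit makes it easier for collaborators to follow and for scripts to enforce.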

Additional resources for creating plans

  • Managing your data – Project Start & End Checklists (MIT Data Management Services): Checklist (PDF) with detailed resources to help researchers set up and maintain robust data management practices for the full life of a project.
  • ezDMP: a free web-based tool for creating DMPs specific to a subset of NSF funding requirements.
  • Guidelines for Effective Data Management Plans and Data Management Plan Resources and Examples (ICPSR): Framework for creating a plan and links to examples of data management plans in various scientific disciplines.
  • Example Plans (University of Minnesota)
  • NSF (by the DART project): assessment rubric and guidance
  • NIH (by FASEB)

See other guides to data management for additional guidance on managing data and select information related to particular formats or disciplines.

Research Data Management

What are Data Management Plans?

A Data Management Plan outlines how data will be collected, organized, stored, secured, shared, and preserved in a research project. It covers data collection methods, organization, storage, sharing, preservation, ethics, and researcher responsibilities. Data Management Plans promote transparency and maximize research impact by ensuring your data can be used effectively, by you, your collaborators, and future generations of researchers. They can be a powerful tool for thinking in advance about collaborative research workflows and can help forecast financial costs associated with data so they can be written into budgets and funded.

Data Management Plans are increasingly required by federal grant funding agencies, such as the National Institutes of Health (NIH), National Science Foundation (NSF), and National Endowment for the Humanities (NEH). Different funders have different policies, so it is important to look at the requirements of the granting agency to learn about their Data Policies and Compliance.

What is included in a Data Management Plan?

Data management plans are brief (2-3 page) documents that outline in advance how you will manage your data throughout the life of your project. They often include:

  • How the data will be collected
  • The type or format of data collected
  • The size of the data
  • How the data will be described (e.g., with codebooks, logs, specific metadata standards, or ontologies)
  • Where the data will be stored, backed up and secured if necessary
  • How the data will be analyzed
  • How the data will be shared and preserved, or reasons not to do so, including who will have permissions to use the data

What tools can help me write a Data Management Plan?

The DMPTool from the California Digital Library is an online tool for creating data management plans. It has templates and resources to guide you through the process of creating a data management plan that is in compliance with funder requirements.

Research Support

Request a data management services consultation.

Email  [email protected] to schedule a consultation related to the organization, storage, preservation, and sharing of data.

  • Last Updated: May 23, 2024 10:37 AM
  • URL: https://guides.library.ucdavis.edu/data-management

Research Data Management: Plan for Data

What is a Data Management Plan?

Data management plans (DMPs) are documents that outline how data will be collected, stored, secured, analyzed, disseminated, and preserved over the lifecycle of a research project. They are typically created in the early stages of a project, and they are typically short documents that may evolve over time. Increasingly, they are required by funders and institutions alike, and they are a recommended best practice in research data management.

Tab through this guide to consider each stage of the research data management process, and each correlated section of a data management plan.

Tools for Data Management Planning

DMPTool is a collaborative effort between several universities to streamline the data management planning process.

The DMPTool supports the majority of federal and many non-profit and private funding agencies that require data management plans as part of a grant proposal application. (View the list of supported organizations and corresponding templates.) If the funder you're applying to isn't listed or you just want to create one as good practice, there is an option for a generic plan.

Key features:

  • Data management plan templates from most major funders
  • Guided creation of a data management plan with click-throughs and helpful questions and examples
  • Access to public plans, to review ahead of creating your own
  • Ability to share plans with collaborators as well as copy and reuse existing plans

How to get started:

Log in with your yale.edu email to be directed to a NetID sign-in, and review the quick start guide.

[Figure: Research Data Lifecycle]

  • Last Updated: Sep 27, 2023 1:15 PM
  • URL: https://guides.library.yale.edu/datamanagement

© 2022 Yale University Library • All Rights Reserved

Writing data management plans

What is a DMP/DMSP?

A data management plan (DMP) or data management and sharing plan (DMSP) is a written document that describes:

  • the data you expect to acquire or generate during the course of a research project,
  • how you will manage, describe, analyze, and store those data, and
  • what mechanisms you will use at the end of your project to share and preserve your data.

You may have already considered some or all of these issues with regard to your research project, but writing them down helps you:

  • formalize the process,
  • identify areas of your plan that need improvement,
  • keep a record of what you intend(ed) to do and an easy reference during the project, and
  • make it easier for everyone in your research group to understand their roles and the data management processes that will be used for the research project.

A DMP is a living document

Research is all about discovery, and the process of doing research sometimes requires you to shift gears and revise your intended path.

Your DMP is a living document that you may need to alter as the course of your research changes. Remember that any time your research plans change, you should review your DMP to make sure that it still meets your needs.

Data management is best addressed in the early stages of a research project, but it is never too late to develop a data management plan.

  • Last Updated: Jul 7, 2023 8:37 AM
  • URL: https://guides.library.stanford.edu/dmps

University of Cambridge

Crafting your data management plan

Most research funders encourage researchers to think about their research data management activities from the beginning of the project. This will often mean a formal plan for managing data (a 'data management plan').

However, even informally setting out your plans and project guidelines can make your life much easier. If you want to be able to reuse your data or manage collaboration with colleagues, it helps to plan for that from the beginning. Decisions you make about which software to use, how to organise, store and manage your data, and the consent agreements you would have to negotiate, will all affect what is possible to do, and what data is shareable in the future.

Planning ahead for your data management needs and activities will help ensure that:

  • you have adequate technological resources (e.g. storage space, support staff time)
  • your data will be robust and free from versioning errors and gaps in documentation
  • your data is backed up and safe from sudden loss or corruption
  • you can meet legal and ethical requirements
  • you are able to share your finalised data publicly, if you and/or your funder desires
  • your data will remain accessible and comprehensible in the near, middle, and distant future.
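
One common way to notice sudden loss or corruption of stored files is to record a checksum for each file and compare it again later. The sketch below uses SHA-256 from Python's standard `hashlib`; the function names, the manifest dictionary, and the sample file are illustrative assumptions rather than part of any funder's or institution's guidance.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to `folder`) to its checksum."""
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or have a different checksum."""
    current = build_manifest(folder)
    return [name for name, checksum in manifest.items() if current.get(name) != checksum]


# Small demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "readings.csv").write_text("site,ph\nA,6.8\n")
    manifest = build_manifest(folder)
    (folder / "readings.csv").write_text("site,ph\nA,6.9\n")  # simulate an unexpected change
    print(changed_files(folder, manifest))  # ['readings.csv']
```

A check like this only detects a problem; recovering from it still depends on having the backups the plan describes.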

What do research funders expect?

Most funders expect you to prepare a data management plan when applying for a research grant. Additionally, some funders, for example the Medical Research Council (MRC), will require you to regularly review your data management plan and make all necessary amendments while managing your grant. The Economic and Social Research Council (ESRC) provides comprehensive guidelines on how to treat personal and sensitive data, as well as on obtaining consent for data collection from participants. The information on funder requirements is available here.

Where do I start?

Much of research data management is simply good research practice, so you will already be some way down the line. Data plans are just a way of ensuring (and/or showing) that you have thought about how to create, store, back up, share and preserve your data.

The Digital Curation Centre (DCC) has produced an interactive online tool to help researchers create data management plans: DMPonline. The website records all major UK/European funder requirements, and it automatically tailors the data management plan template to the needs of your funder. You can log in to DMPonline using your Raven account (to do this, simply select the University of Cambridge as your institution, and you will be re-directed to the Raven log-in interface). Data plans that you create are easily exportable to a desired file type (Word, Excel, PDF), so you can simply add them to your grant applications.

What should I include in my data plan?

The best way to start is to look at what your funder expects you to cover in your Data Management Plan. You can check this either on your funder's website or by using the DMPonline tool, which is populated with funders' templates and will guide you through your funder's requirements.

Who can help with data planning at the University of Cambridge?

The University has a range of support staff who can help you create a data management plan, including:

  • your departmental or college IT staff
  • subject and departmental librarians
  • your funder - some funders, for example, the Economic and Social Research Council (ESRC), offer support in preparation of data management plans

No matter who you ask for support, please get in touch early, so there is enough time for support staff to help. 

Simple data management plan template

Have a look at our simple data management plan template here - if your funder does not provide guidance on data plans, this might be a good starting point.

Related links

  • DMPonline - tool to create data management plans
  • ESRC - support for data management plans

About this site:

This site is managed by the Research Data Team.

If you have any questions about this site, please e-mail us directly.

The project is a joint initiative of Cambridge University Library and the Research Operations Office.

© 2024 University of Cambridge

Research Data Management: Data Management Plan

About Data Management Plans

What is a Data Management Plan (DMP)?

A data management plan (DMP) is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyze, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.

You may have already considered some or all of these issues with regard to your research project, but writing them down helps you formalize the process, identify weaknesses in your plan, and provide you with a record of what you intend(ed) to do. Data management is best addressed in the early stages of a research project, but it is never too late to develop a data management plan.

Creating a Data Management Plan

Research is all about discovery, and the process of doing research sometimes requires you to shift gears and revise your intended path. Your DMP is a living document that you may need to alter as the course of your research changes. Remember that any time your research plans change, you should review your DMP to make sure that it still meets your needs.

What should be covered in the Data Plan

The framework below, adapted from one developed by the Inter-University Consortium for Political and Social Research (ICPSR), shows one approach to the elements of a data management plan.

DMPTool @NWU

Online tool for creating a DMP @NWU

North-West University Libraries provides access to the online Data Management Planning (DMP) Tool. The DMPTool includes data management plan templates, along with a wealth of information and assistance to guide you through the process of creating a ready-to-use DMP for your specific research project and funding agency.

We can review your data management plan and make suggestions. We are also happy to verify whether your intended use of the Dayta Ya Rona Digital Repository as described in your plan matches up with the Dayta Ya Rona services we provide.

DMP submission

Once your data management plan is complete, you will include it with the rest of your proposal to the funding agency. North-West University's Research Office has further information on proposal development and submission.

  • Last Updated: May 23, 2024 8:46 AM
  • URL: https://libguides.nwu.ac.za/research-data-management

PLoS Comput Biol, v.11(10); 2015 Oct

Ten Simple Rules for Creating a Good Data Management Plan

William K. Michener

College of University Libraries & Learning Sciences, University of New Mexico, Albuquerque, New Mexico, United States of America

Introduction

Research papers and data products are key outcomes of the science enterprise. Governmental, nongovernmental, and private foundation sponsors of research are increasingly recognizing the value of research data. As a result, most funders now require that sufficiently detailed data management plans be submitted as part of a research proposal. A data management plan (DMP) is a document that describes how you will treat your data during a project and what happens with the data after the project ends. Such plans typically cover all or portions of the data life cycle—from data discovery, collection, and organization (e.g., spreadsheets, databases), through quality assurance/quality control, documentation (e.g., data types, laboratory methods) and use of the data, to data preservation and sharing with others (e.g., data policies and dissemination approaches). Fig 1 illustrates the relationship between hypothetical research and data life cycles and highlights the links to the rules presented in this paper. The DMP undergoes peer review and is used in part to evaluate a project’s merit. Plans also document the data management activities associated with funded projects and may be revisited during performance reviews.

Fig 1. Relationship between the research life cycle (A) and the data life cycle (B).

As part of the research life cycle (A), many researchers (1) test ideas and hypotheses by (2) acquiring data that are (3) incorporated into various analyses and visualizations, leading to interpretations that are then (4) published in the literature and disseminated via other mechanisms (e.g., conference presentations, blogs, tweets), and that often lead back to (1) new ideas and hypotheses. During the data life cycle (B), researchers typically (1) develop a plan for how data will be managed during and after the project; (2) discover and acquire existing data and (3) collect and organize new data; (4) assure the quality of the data; (5) describe the data (i.e., ascribe metadata); (6) use the data in analyses, models, visualizations, etc.; and (7) preserve and (8) share the data with others (e.g., researchers, students, decision makers), possibly leading to new ideas and hypotheses.

Earlier articles in the Ten Simple Rules series of PLOS Computational Biology provided guidance on getting grants [1], writing research papers [2], presenting research findings [3], and caring for scientific data [4]. Here, I present ten simple rules that can help guide the process of creating an effective plan for managing research data, the basis for the project’s findings, research papers, and data products. I focus on the principles and practices that will result in a DMP that can be easily understood by others and put to use by your research team. Moreover, following the ten simple rules will help ensure that your data are safe and sharable and that your project maximizes the funder’s return on investment.

Rule 1: Determine the Research Sponsor Requirements

Research communities typically develop their own standard methods and approaches for managing and disseminating data. Likewise, research sponsors often have very specific DMP expectations. For instance, the Wellcome Trust, the Gordon and Betty Moore Foundation (GBMF), the United States National Institutes of Health (NIH), and the US National Science Foundation (NSF) all fund computational biology research but differ markedly in their DMP requirements. The GBMF, for instance, requires that potential grantees develop a comprehensive DMP in conjunction with their program officer that answers dozens of specific questions. In contrast, NIH requirements are much less detailed and primarily ask that potential grantees explain how data will be shared or provide reasons as to why the data cannot be shared. Furthermore, a single research sponsor (such as the NSF) may have different requirements that are established for individual divisions and programs within the organization. Note that plan requirements may not be labeled as such; for example, the National Institutes of Health guidelines focus largely on data sharing and are found in a document entitled “NIH Data Sharing Policy and Implementation Guidance” (http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm).

Significant time and effort can be saved by first understanding the requirements set forth by the organization to which you are submitting a proposal. Research sponsors normally provide DMP requirements either in the public request for proposals (RFP) or in an online grant proposal guide. The DMPTool ( https://dmptool.org/ ) and DMPonline ( https://dmponline.dcc.ac.uk/ ) websites are also extremely valuable resources that provide updated funding agency plan requirements (for the US and United Kingdom, respectively) in the form of templates that are usually accompanied by annotated advice for filling in the template. The DMPTool website also includes numerous example plans that have been published by DMPTool users. Such examples provide an indication of the depth and breadth of detail that are normally included in a plan and often lead to new ideas that can be incorporated in your plan.

Regardless of whether you have previously submitted proposals to a particular funding program, it is always important to check the latest RFP, as well as the research sponsor’s website, to verify whether and how requirements have recently changed. Furthermore, don’t hesitate to contact the responsible program officer(s) listed in a specific solicitation to discuss sponsor requirements or to address specific questions that arise as you are creating a DMP for your proposed project. Keep in mind that the principal objective should be to create a plan that will be useful for your project. Thus, good data management plans can and often do contain more information than is minimally required by the research sponsor. Note, though, that some sponsors constrain the length of DMPs (e.g., two-page limit); in such cases, a synopsis of your more comprehensive plan can be provided, and it may be permissible to include an appendix, supplementary file, or link.

Rule 2: Identify the Data to Be Collected

Every component of the DMP depends upon knowing how much and what types of data will be collected. Data volume is clearly important, as it normally costs more in terms of infrastructure and personnel time to manage 10 terabytes of data than 10 megabytes. But other characteristics of the data also affect costs, as well as metadata, data quality assurance and preservation strategies, and even data policies. A good plan will include enough information to convey the nature of the data to be collected, including:

  • Types. A good first step is to list the various types of data that you expect to collect or create. This may include text, spreadsheets, software and algorithms, models, images and movies, audio files, and patient records. Note that many research sponsors define data broadly to include physical collections, software and code, and curriculum materials.
  • Sources. Data may come from direct human observation, laboratory and field instruments, experiments, simulations, and compilations of data from other studies. Reviewers and sponsors may be particularly interested in understanding if data are proprietary, are being compiled from other studies, pertain to human subjects, or are otherwise subject to restrictions in their use or redistribution.
  • Volume. Both the total volume of data and the total number of files that are expected to be collected can affect all other data management activities.
  • Data and file formats. Technology changes, and formats that are acceptable today may soon be obsolete. Good choices include those formats that are nonproprietary, based upon open standards, and widely adopted and preferred by the scientific community (e.g., Comma Separated Values [CSV] over Excel [.xls, .xlsx]). Data are more likely to be accessible for the long term if they are uncompressed, unencrypted, and stored using standard character encodings such as UTF-16.

The precise types, sources, volume, and formats of data may not be known beforehand, depending on the nature and uniqueness of the research. In such cases, the solution is to iteratively update the plan (see Rule 9 ).
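When pilot data or data from a previous project already exist, a quick inventory can ground the estimates of types, volume, and formats. The following is a minimal sketch, not a prescribed method; the function name `summarize_data_files` and the use of the file extension as a stand-in for format are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def summarize_data_files(root):
    """Return {extension: (file_count, total_bytes)} for all files under root.

    Hypothetical helper: the file extension is used as a rough proxy
    for the data format.
    """
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalize case so ".CSV" and ".csv" are counted together.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in sorted(counts)}
```

Calling `summarize_data_files("pilot_data")` on a directory of your own (the directory name here is only an example) gives per-format file counts and byte totals that can be scaled up to the expected size of the full project.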

Rule 3: Define How the Data Will Be Organized

Once there is an understanding of the volume and types of data to be collected, an obvious next step is to define how the data will be organized and managed. For many projects, a small number of data tables will be generated that can be effectively managed with commercial or open source spreadsheet programs like Excel and OpenOffice Calc. Larger data volumes and usage constraints may require the use of a relational database management system (RDBMS) such as Oracle or MySQL for linked data tables, or a Geographic Information System (GIS) such as ArcGIS, GRASS, or QGIS for geospatial data layers.

The details about how the data will be organized and managed could fill many pages of text and, in fact, should be recorded as the project evolves. However, in drafting a DMP, it is most helpful to initially focus on the types and, possibly, names of the products that will be used. The software tools that are employed in a project should be amenable to the anticipated tasks. A spreadsheet program, for example, would be insufficient for a project in which terabytes of data are expected to be generated, and a sophisticated RDBMS may be overkill for a project in which only a few small data tables will be created. Furthermore, projects dependent upon a GIS or RDBMS may entail considerable software costs and design and programming effort that should be planned and budgeted for upfront (see Rules 9 and 10 ). Depending on sponsor requirements and space constraints, it may also be useful to specify conventions for file naming, persistent unique identifiers (e.g., Digital Object Identifiers [DOIs]), and version control (for both software and data products).
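A file naming convention is easier to follow when it can be checked automatically. The sketch below assumes a purely hypothetical convention of the form `<project>_<YYYYMMDD>_<description>_v<NN>.<ext>`; neither the pattern nor the helper name `parse_filename` comes from any sponsor requirement, and both should be adapted to whatever convention your plan specifies.

```python
import re

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
# Lowercase letters, digits, and hyphens only; no spaces.
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"(?P<description>[a-z0-9-]+)_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def parse_filename(name):
    """Return the name's components as a dict, or None if it breaks the convention."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None
```

For example, `parse_filename("soilsurvey_20240115_ph-readings_v02.csv")` returns the project, date, description, version, and extension, whereas a name such as `"Final data (2).xlsx"` returns `None` and can be flagged for renaming.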

Rule 4: Explain How the Data Will Be Documented

Rows and columns of numbers and characters have little to no meaning unless they are documented in some fashion. Metadata—the details about what, where, when, why, and how the data were collected, processed, and interpreted—provide the information that enables data and files to be discovered, used, and properly cited. Metadata include descriptions of how data and files are named, physically structured, and stored as well as details about the experiments, analytical methods, and research context. It is generally the case that the utility and longevity of data relate directly to how complete and comprehensive the metadata are. The amount of effort devoted to creating comprehensive metadata may vary substantially based on the complexity, types, and volume of data.

A sound documentation strategy can be based on three steps. First, identify the types of information that should be captured to enable a researcher like you to discover, access, interpret, use, and cite your data. Second, determine whether there is a community-based metadata schema or standard (i.e., preferred sets of metadata elements) that can be adopted. As examples, variations of the Dublin Core Metadata Initiative Abstract Model are used for many types of data and other resources, ISO (International Organization for Standardization) 19115 is used for geospatial data, ISA-Tab file format is used for experimental metadata, and Ecological Metadata Language (EML) is used for many types of environmental data. In many cases, a specific metadata content standard will be recommended by a target data repository, archive, or domain professional organization. Third, identify software tools that can be employed to create and manage metadata content (e.g., Metavist, Morpho). In lieu of existing tools, text files (e.g., readme.txt) that include the relevant metadata can be included as headers to the data files.
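Where no metadata tool is available, even a plain-text readme can be generated consistently. The following is a minimal sketch under stated assumptions: the field names (title, creator, date collected, methods, columns) are illustrative choices, not elements of Dublin Core, EML, or any other standard named above, and `build_readme` is a hypothetical helper.

```python
def build_readme(title, creator, collected, methods, columns):
    """Render a plain-text README describing one data file.

    `columns` maps each column name to a (description, unit) pair.
    The field names are illustrative, not taken from a metadata standard.
    """
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Date collected: {collected}",
        f"Methods: {methods}",
        "",
        "Columns:",
    ]
    for name, (description, unit) in columns.items():
        lines.append(f"  {name}: {description} [{unit}]")
    return "\n".join(lines) + "\n"
```

The returned text can be written to a readme.txt stored next to the data file. If a target repository recommends a specific metadata standard, that standard's required elements should replace the fields assumed here.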

A best practice is to assign a responsible person to maintain an electronic lab notebook, in which all project details are maintained. The notebook should ideally be routinely reviewed and revised by another team member, as well as duplicated (see Rules 6 and 9 ). The metadata recorded in the notebook provide the basis for the metadata that will be associated with data products that are to be stored, reused, and shared.

Rule 5: Describe How Data Quality Will Be Assured

Quality assurance and quality control (QA/QC) refer to the processes that are employed to measure, assess, and improve the quality of products (e.g., data, software, etc.). It may be necessary to follow specific QA/QC guidelines depending on the nature of a study and research sponsorship; such requirements, if they exist, are normally stated in the RFP. Regardless, it is good practice to describe the QA/QC measures that you plan to employ in your project. Such measures may encompass training activities, instrument calibration and verification tests, double-blind data entry, and statistical and visualization approaches to error detection. Simple graphical data exploration approaches (e.g., scatterplots, mapping) can be invaluable for detecting anomalies and errors.
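One of the measures above, entering the same records twice and comparing the results, is simple to automate. The sketch below assumes each pass has been loaded as a list of dictionaries with one dictionary per record, in the same order; the function name `compare_entries` is hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """Return (row_index, column, value_1, value_2) for each mismatch.

    Both arguments are lists of dicts holding the same records in the
    same order, keyed independently by two people.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("the two passes contain different numbers of rows")
    mismatches = []
    for i, (row_a, row_b) in enumerate(zip(first_pass, second_pass)):
        # Check every column that appears in either version of the row.
        for column in sorted(set(row_a) | set(row_b)):
            a, b = row_a.get(column), row_b.get(column)
            if a != b:
                mismatches.append((i, column, a, b))
    return mismatches
```

Each reported mismatch points to a cell that should be checked against the original record. An empty list means the two passes agree, which does not by itself prove that the values are correct.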

Rule 6: Present a Sound Data Storage and Preservation Strategy

A common mistake of inexperienced (and even many experienced) researchers is to assume that their personal computer and website will live forever. They fail to routinely duplicate their data during the course of the project and do not see the benefit of archiving data in a secure location for the long term. Inevitably, though, papers get lost, hard disks crash, URLs break, and tapes and other media degrade, with the result that the data become unavailable for use by both the originators and others. Thus, data storage and preservation are central to any good data management plan. Give careful consideration to three questions:

  • How long will the data be accessible?
  • How will data be stored and protected over the duration of the project?
  • How will data be preserved and made available for future use?

The answer to the first question depends on several factors. First, determine whether the research sponsor or your home institution has any specific requirements. Usually, not all data need to be retained, and those that do need not be retained forever. Second, consider the intrinsic value of the data. Observations of phenomena that cannot be repeated (e.g., astronomical and environmental events) may need to be stored indefinitely. Data from easily repeatable experiments may only need to be stored for a short period. Simulations may only need to have the source code, initial conditions, and verification data stored. In addition to explaining how data will be selected for short-term storage and long-term preservation, remember to also highlight your plans for the accompanying metadata and related code and algorithms that will allow others to interpret and use the data (see Rule 4 ).

Develop a sound plan for storing and protecting data over the life of the project. A good approach is to store at least three copies in at least two geographically distributed locations (e.g., original location such as a desktop computer, an external hard drive, and one or more remote sites) and to adopt a regular schedule for duplicating the data (i.e., backup). Remote locations may include an offsite collaborator’s laboratory, an institutional repository (e.g., your departmental, university, or organization’s repository if located in a different building), or a commercial service, such as those offered by Amazon, Dropbox, Google, and Microsoft. The backup schedule should also include testing to ensure that stored data files can be retrieved.
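Testing that backed-up files can be retrieved intact can be done by comparing checksums between the original and each copy. The following is a minimal sketch using SHA-256 from the Python standard library; the helper names `sha256_of`, `build_manifest`, and `verify_copy` are assumptions for illustration, and the sketch presumes that each copy is mounted or downloaded as an ordinary directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(original_root, copy_root):
    """Return relative paths that are missing from, or differ in, the copy."""
    original = build_manifest(original_root)
    copy = build_manifest(copy_root)
    return sorted(
        name for name, digest in original.items() if copy.get(name) != digest
    )
```

Running `verify_copy` against each backup location as part of the regular schedule yields the list of files that need to be re-copied; an empty list indicates that every original file is present and byte-for-byte identical in the copy.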

Being able to access the data 20 years beyond the life of the project will likely require a more robust solution (i.e., question 3 above). Seek advice from colleagues and librarians to identify an appropriate data repository for your research domain. Many disciplines maintain specific repositories such as GenBank for nucleotide sequence data and the Protein Data Bank for protein structures. Likewise, many universities and organizations also host institutional repositories, and there are numerous general science data repositories such as Dryad ( http://datadryad.org/ ), figshare ( http://figshare.com/ ), and Zenodo ( http://zenodo.org/ ). Alternatively, one can easily search for discipline-specific and general-use repositories via online catalogs such as http://www.re3data.org/ (i.e., REgistry of REsearch data REpositories) and http://www.biosharing.org (i.e., BioSharing). It is often considered good practice to deposit code in a host repository like GitHub that specializes in source code management as well as some types of data like large files and tabular data (see https://github.com/ ). Make note of any repository-specific policies (e.g., data privacy and security, requirements to submit associated code) and costs for data submission, curation, and backup that should be included in the DMP and the proposal budget.

Rule 7: Define the Project’s Data Policies

Despite what may be a natural proclivity to avoid policy and legal matters, researchers cannot afford to do so when it comes to data. Research sponsors, institutions that host research, and scientists all have a role in and obligation for promoting responsible and ethical behavior. Consequently, many research sponsors require that DMPs include explicit policy statements about how data will be managed and shared. Such policies include:

  • licensing or sharing arrangements that pertain to the use of preexisting materials;
  • plans for retaining, licensing, sharing, and embargoing (i.e., limiting use by others for a period of time) data, code, and other materials; and
  • legal and ethical restrictions on access and use of human subject and other sensitive data.

Unfortunately, policies and laws often appear or are, in fact, confusing or contradictory. Furthermore, policies that apply within a single organization or in a given country may not apply elsewhere. When in doubt, consult your institution’s office of sponsored research, the relevant Institutional Review Board, or the program officer(s) assigned to the program to which you are applying for support.

Despite these caveats, it is usually possible to develop a sound policy by following a few simple steps. First, if preexisting materials, such as data and code, are being used, identify and include a description of the relevant licensing and sharing arrangements in your DMP. Explain how third party software or libraries are used in the creation and release of new software. Note that proprietary and intellectual property rights (IPR) laws and export control regulations may limit the extent to which code and software can be shared.

Second, explain how and when the data and other research products will be made available. Be sure to explain any embargo periods or delays such as publication or patent reasons. A common practice is to make data broadly available at the time of publication, or in the case of graduate students, at the time the graduate degree is awarded. Whenever possible, apply standard rights waivers or licenses, such as those established by Open Data Commons (ODC) and Creative Commons (CC), that guide subsequent use of data and other intellectual products (see http://creativecommons.org/ and http://opendatacommons.org/licenses/pddl/summary/ ). The CC0 license and the ODC Public Domain Dedication and License, for example, promote unrestricted sharing and data use. Nonstandard licenses and waivers can be a significant barrier to reuse.

Third, explain how human subject and other sensitive data will be treated (e.g., see http://privacyruleandresearch.nih.gov/ for information pertaining to human health research regulations set forth in the US Health Insurance Portability and Accountability Act). Many research sponsors require that investigators engaged in human subject research seek or receive prior approval from the appropriate Institutional Review Board before a grant proposal is submitted and, certainly, receive approval before the actual research is undertaken. Approvals may require that informed consent be granted, that data be anonymized, or that use be restricted in some fashion.

Rule 8: Describe How the Data Will Be Disseminated

The best-laid preservation plans and data sharing policies do not necessarily mean that a project’s data will see the light of day. Reviewers and research sponsors will be reassured that this will not be the case if you have spelled out how and when the data products will be disseminated to others, especially people outside your research group. There are passive and active ways to disseminate data. Passive approaches include posting data on a project or personal website or mailing or emailing data upon request, although the latter can be problematic when dealing with large data and bandwidth constraints. More active, robust, and preferred approaches include: (1) publishing the data in an open repository or archive (see Rule 6 ); (2) submitting the data (or subsets thereof) as appendices or supplements to journal articles, such as is commonly done with the PLOS family of journals; and (3) publishing the data, metadata, and relevant code as a “data paper” [ 5 ]. Data papers can be published in various journals, including Scientific Data (from Nature Publishing Group), the GeoScience Data Journal (a Wiley publication on behalf of the Royal Meteorological Society), and GigaScience (a joint BioMed Central and Springer publication that supports big data from many biology and life science disciplines).

A good dissemination plan includes a few concise statements. State when, how, and what data products will be made available. Generally, making data available to the greatest extent and with the fewest possible restrictions at the time of publication or project completion is encouraged. The more proactive approaches described above are greatly preferred over mailing or emailing data and will likely save significant time and money in the long run, as the data curation and sharing will be supported by the appropriate journals and repositories or archives. Furthermore, many journals and repositories provide guidelines and mechanisms for how others can appropriately cite your data, including digital object identifiers and recommended citation formats; this helps ensure that you receive credit for the data products you create. Keep in mind that the data will be more usable and interpretable by you and others if the data are disseminated using standard, nonproprietary approaches and if the data are accompanied by metadata and the associated code used for data processing.

Rule 9: Assign Roles and Responsibilities

A comprehensive DMP clearly articulates the roles and responsibilities of every named individual and organization associated with the project. Roles may include data collection, data entry, QA/QC, metadata creation and management, backup, data preparation and submission to an archive, and systems administration. Consider time allocations and levels of expertise needed by staff. For small to medium size projects, a single student or postdoctoral associate who is collecting and processing the data may easily assume most or all of the data management tasks. In contrast, large, multi-investigator projects may benefit from having a dedicated staff person(s) assigned to data management.

Treat your DMP as a living document and revisit it frequently (e.g., quarterly). Assign a project team member to revise the plan so that it reflects any changes in protocols and policies. It is good practice to track changes in a revision history that lists the date of each change to the plan along with the details of the change, including who made it.
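A revision history of this kind needs only three pieces of information per entry: the date, the person, and what changed. The sketch below is one possible representation, assuming the history is kept alongside the plan in a script or notebook; the `Revision` class and `format_history` function are hypothetical names, and the entries shown in the usage note are invented placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Revision:
    """One change to the plan: when it was made, by whom, and what changed."""
    changed_on: date
    author: str
    summary: str


def format_history(revisions):
    """Render revisions newest first, one per line."""
    ordered = sorted(revisions, key=lambda r: r.changed_on, reverse=True)
    return "\n".join(
        f"{r.changed_on.isoformat()}  {r.author}: {r.summary}" for r in ordered
    )
```

For instance, passing `Revision(date(2024, 1, 15), "A. Lee", "Initial plan")` and a later entry for an updated backup schedule to `format_history` produces a dated log, newest entry first, that can be pasted at the end of the DMP. A version control system applied to the plan document would serve the same purpose.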

Reviewers and sponsors may be especially interested in knowing how adherence to the data management plan will be assessed and demonstrated, as well as how, and by whom, data will be managed and made available after the project concludes. With respect to the latter, it is often sufficient to include a pointer to the policies and procedures that are followed by the repository where you plan to deposit your data. Be sure to note any contributions by nonproject staff, such as any repository, systems administration, backup, training, or high-performance computing support provided by your institution.

Rule 10: Prepare a Realistic Budget

Creating, managing, publishing, and sharing high-quality data is as much a part of the 21st century research enterprise as is publishing the results. Data management is not new; rather, it is something that all researchers already do. Nonetheless, a common mistake in developing a DMP is forgetting to budget for the activities. Data management takes time and costs money in terms of software, hardware, and personnel. Review your plan and make sure that there are lines in the budget to support the people who manage the data (see Rule 9 ) as well as to pay for the requisite hardware, software, and services. Check with the preferred data repository (see Rule 6 ) so that requisite fees and services are budgeted appropriately. As space allows, help reviewers by pointing to specific lines or sections in the budget and budget justification pages. Experienced reviewers will be on the lookout for unfunded components, but they will also recognize that greater or lesser investments in data management depend upon the nature of the research and the types of data.

A data management plan should provide you and others with an easy-to-follow road map that will guide and explain how data are treated throughout the life of the project and after the project is completed. The ten simple rules presented here are designed to aid you in writing a good plan that is logical and comprehensive, that will pass muster with reviewers and research sponsors, and that you can put into practice should your project be funded. A DMP provides a vehicle for conveying information to and setting expectations for your project team during both the proposal and project planning stages, as well as during project team meetings later, when the project is underway. That said, no plan is perfect. Plans do become better through use. The best plans are “living documents” that are periodically reviewed and revised as necessary according to needs and any changes in protocols (e.g., metadata, QA/QC, storage), policy, technology, and staff, as well as reused, in that the most successful parts of the plan are incorporated into subsequent projects. A public, machine-readable, and openly licensed DMP is much more likely to be incorporated into future projects and to have higher impact; such increased transparency in the research funding process (e.g., publication of proposals and DMPs) can assist researchers and sponsors in discovering data and potential collaborators, educating about data management, and monitoring policy compliance [ 6 ].

Acknowledgments

This article is the outcome of a series of training workshops provided for new faculty, postdoctoral associates, and graduate students.

Funding Statement

This work was supported by NSF IIA-1301346, IIA-1329470, and ACI-1430508 ( http://nsf.gov ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Management Plans

A data management plan (DMP) is a formal document that outlines how a researcher intends to manage their research data during and after a project.

Creating a DMP can help you:

  • Make decisions about managing your data and understand the implications of those decisions
  • Identify resources and tools needed for your research
  • Estimate costs for resources and tools and budget accordingly
  • Anticipate and mitigate problems (e.g. data loss, duplication of effort, security breaches)
  • Ensure everyone in your research group understands what is happening with the data

Creating a DMP

A DMP should follow the format specified by a funder or any other relevant requirements. If no format is specified, you can create a DMP in whatever way works for you (e.g., a text document or Excel file). You can use the DMP Planning Checklist and Question Guide to help develop your DMP.

DMPs should be created in consultation with all research group members and partners. If you are conducting research with Indigenous communities, your DMP should be co-created with those communities.

Online DMP tools

There are also a number of free, online tools that can help you draft a DMP. These tools provide templates based on discipline, research method, institutional context, and funding requirements.

  • DMP Assistant (Canada, provides templates based on discipline and research method)
  • DMPOnline (United Kingdom, provides templates from various UK funding agencies)
  • DMPTool  (United States, provides templates from various US funding agencies)

Examples of DMPs

  • Canadian examples from the Digital Research Alliance of Canada
  • Canadian examples from DMP Assistant (public plans, not vetted)
  • NIH sample data management and sharing plans
  • US examples from DMPTool (public plans, not vetted)
  • UK examples from the Digital Curation Centre
  • UK examples from DMPonline (public plans, not vetted)

Library resources

  • Data Management Plan (DMP) - Planning Checklist - this checklist can be used to ensure you’ve included all relevant information in a DMP or to help you gather information you will need to complete your DMP.
  • Data Management Plan (DMP) - Question Guide - this guide includes prompting questions and guidance to help you develop a DMP.

Library services

The library provides support for:

  • Understanding what information and level of detail to include in a DMP
  • Navigating funder and journal requirements
  • Reviewing DMPs for completeness
  • DMP Assistant

Please note that our services are intended to provide support and guidance. The library cannot validate methodological approaches.

External resources

  • Brief Guide - Create an Effective Data Management Plan (Digital Research Alliance of Canada)
  • Brief Guide - Data Management Plan (Digital Research Alliance of Canada)
  • DMP Assistant Video Tutorial Series (Digital Research Alliance of Canada)
  • Primer - Data Management Plans (Digital Research Alliance of Canada)
  • Guide to Preparing a Data Management Plan (SSHRC - Social Sciences and Humanities Research Council)
  • Data Management Plans for NEH Office of Digital Humanities Proposals and Awards (NEH - National Endowment for the Humanities)
  • Writing a Data Management & Sharing Plan (NIH - National Institutes of Health)
  • Preparing Your Data Management Plan (NSF - National Science Foundation)
  • Costs of Data Management (Utrecht University)
  • Data management costing tool and checklist (UK Data Service)
  • How to Develop a Data Management and Sharing Plan (Digital Curation Centre)

Research Data Management


Data Management Plans

Facts About Data Management Plans: 

A Data Management Plan (DMP) is a written document that formally outlines what you will do with your research data during the course of your research project and afterwards. It is a living document: any time your research plans change, you should review your DMP to make sure that it still satisfies your essential data needs. Managing your data is important for several reasons. First, it enhances the integrity of your research by increasing access to, and therefore the reproducibility of, your research data. Second, it safeguards your data and allows you to share it, both for recognition and to facilitate new scientific discoveries. Lastly, the number of funding agencies that require you to share and preserve your data is growing. Although a DMP can be written at any point in the research cycle (it is never too late), it is best to plan one early, since planning ahead avoids many data management headaches.

Funding Agency Requirements

Several funding agencies, both federal and private, require a DMP with every funding application.

  • SPARC: The Scholarly Publishing and Academic Resources Coalition (SPARC) has assembled a resource about data management and data sharing requirements from all of the federal funding agencies.

Examples of Federal Funding Agencies that Require Data Sharing or a DMP:

  • 2011 - National Science Foundation: An extension of the NSF Data Sharing Policy requires all applicants to submit a DMP with their funding request. Noncompliance could lead to award rejection.
  • 2013 - National Institutes of Health's Public Access Policy: Requires applicants to share their research findings. Noncompliance can lead to award delays.

DMP Examples

Examples of Data Management Plans:

  • NIH Data Sharing Plans
  • DataONE DMP Examples
  • DMPTool: This tool provides ample guidance on how to design a DMP for your specific type of research project and funding agency, including the Gordon and Betty Moore Foundation, the National Institutes of Health (NIH), and the National Science Foundation (NSF).

The DMPTool will help tailor your DMP to the requirements of a specific funding agency. In general, it is important to consider the following topics when writing a DMP:

  • Roles and Responsibilities
  • Types of Data  
  • File Formats  
  • Organizing Files  
  • Metadata: Data Documentation  
  • Persistent Identifiers  
  • Security and Storage  
  • Sharing and Access  
  • Data Preservation and Archiving  
  • Citing Data and Data Redistribution  
  • Copyright & Privacy  

Institutional DMPTool Partners:

  • WSU: WSU is a DMPTool institutional partner; WSU students and faculty need only sign in with their WSU login and password.
  • MIT: MIT provides DMP questions that help guide the design of your DMP.
  • UCLA: UCLA provides a DMP template to assist you with your DMP planning and design.
  • Harvard: Harvard provides a best practice DMP template for you to use.

  • Last Updated: Jan 4, 2024 11:01 AM
  • URL: https://libguides.libraries.wsu.edu/rdmlibguide

Case Western Reserve University

  • Research Data Lifecycle Guide

Developing a Data Management Plan

This section breaks down the topics involved in planning and preparing data used in research at Case Western Reserve University. In this phase you should understand the research being conducted, the types of data and the methods used to collect them, and the methods used to prepare and analyze the data; address the budgets and resources required; and have a sound understanding of how you will manage data activities during your research project.

Many federal sponsors of Case Western Reserve funded research have required data sharing plans in research proposals since 2003. As of Jan. 25, 2023, the National Institutes of Health has revised its data management and sharing requirements. 

This website is designed to provide basic information and best practices to seasoned and new investigators as well as detailed guidance for adhering to the revised NIH policy.  

Basics of Research Data Management

What is research data management?

Research data management (RDM) comprises a set of best practices that include file organization, documentation, storage, backup, security, preservation, and sharing, which affords researchers the ability to more quickly, efficiently, and accurately find, access, and understand their own or others' research data.

Why should you care about research data management?

RDM practices, if applied consistently and as early in a project as possible, can save you considerable time and effort later, when specific data are needed, when others need to make sense of your data, or when you decide to share or otherwise upload your data to a digital repository. Adopting RDM practices will also help you more easily comply with the data management plan (DMP) required for obtaining grants from many funding agencies and institutions.

Does data need to be retained after a project is completed?

Research data must be retained in sufficient detail and for an adequate period of time to enable appropriate responses to questions about accuracy, authenticity, primacy and compliance with laws and regulations governing the conduct of the research. External funding agencies will each have different requirements regarding storage, retention, and availability of research data. Please carefully review your award or agreement for the disposition of data requirements and data retention policies.

A good data management plan begins with understanding the requirements of the sponsor funding your research. As a principal investigator (PI), it is your responsibility to be knowledgeable about sponsor requirements. The Data Management Plan Tool (DMPTool) has been designed to help PIs adhere to sponsor requirements efficiently and effectively. It is strongly recommended that you take advantage of the DMPTool.

CWRU has an institutional account with DMPTool that enables users to access all of its resources via their Single Sign On credentials. CWRU's DMPTool account is supported by members of the Digital Scholarship team with the Freedman Center for Digital Scholarship. Please use the RDM Intake Request form to schedule a consultation if you would like support or guidance on developing a Data Management Plan.

Some basic steps to get started:

  • Sign into the  DMPTool site  to start creating a DMP for managing and sharing your data. 
  • On the DMPTool site, you can find the most up to date templates for creating a DMP for a long list of funders, including the NIH, NEH, NSF, and more. 
  • Explore sample DMPs to see examples of successful plans.

Be sure that your DMP is addressing any and all federal and/or funder requirements and associated DMP templates that may apply to your project. It is strongly recommended that investigators submitting proposals to the NIH utilize this tool. 

The NIH requires Data Management and Sharing Plans for all proposals submitted on or after Jan. 25, 2023. Guidance for completing an NIH Data Management Plan has its own dedicated content, which gives investigators detailed guidance on developing these plans for inclusion in proposals.

A Data Management Plan can help create and maintain reliable data and promote project success. DMPs, when carefully constructed and reliably adhered to, help guide elements of your research and data organization.

A DMP can help you:

Document your process and data.

  • Maintain a file with information on researchers and collaborators and their roles, sponsors/funding sources, methods/techniques/protocols/standards used, instrumentation, software (with versions), references used, and any applicable restrictions on the data's distribution or use.
  • Establish how you will document file changes, name changes, dates of changes, etc. Where will you record these changes? Try to keep this sort of information in a plain text file located in the same folder as the files to which it pertains.
  • How are derived data products created? A DMP encourages consistent description of the data processing performed, the software (including version number) used, and the analyses applied to the data.
  • Establish regular forms or templates for data collection. This helps reduce gaps in your data and promotes consistency throughout the project.
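
The advice above about recording file changes in a plain text file beside the data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file name CHANGELOG.txt, the tab-separated layout, and the function name log_change are hypothetical choices, not part of any funder or CWRU requirement.

```python
from datetime import date
from pathlib import Path


def log_change(data_file: Path, note: str, author: str, when: date) -> Path:
    """Append one dated entry to CHANGELOG.txt in the same folder as data_file.

    The log name and column layout are assumptions made for this sketch.
    """
    log_path = data_file.parent / "CHANGELOG.txt"
    # One line per change: ISO date, file name, who changed it, what changed.
    entry = f"{when.isoformat()}\t{data_file.name}\t{author}\t{note}\n"
    # Mode "a" appends, so earlier entries are never overwritten.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return log_path
```

Calling log_change(Path("data/survey_v01.csv"), "renamed column temp to temp_c", "Doe", date.today()) would add one line to data/CHANGELOG.txt. A plain text log like this stays readable without special software.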

Explain your data

  • From the outset, consider why your data were collected, what the known and expected conditions may be for collection, and information such as time and place, resolution, and standards of data collected.
  • What attributes, fields, or parameters will be studied and included in your data files? Identify and describe these in each file that employs them.
  • For an overview of data dictionaries, see the USGS page here: https://www.usgs.gov/products/data-and-tools/data-management/data-dictionaries
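
As a rough illustration of describing every attribute in a data file, the sketch below keeps a small data dictionary in Python and reports any column that has no entry. The field names (site_id, temp_c), the dictionary layout, and the function name undocumented_fields are hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column in the data files.
DATA_DICTIONARY = {
    "site_id": {"description": "Identifier of the sampling site", "type": "string", "units": ""},
    "temp_c": {"description": "Air temperature at collection time", "type": "float", "units": "degrees Celsius"},
}


def undocumented_fields(csv_text: str, dictionary: dict) -> list:
    """Return the header names in csv_text that have no data dictionary entry."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first row is assumed to hold the column names
    return [name for name in header if name not in dictionary]
```

Running such a check before sharing a file gives an early warning when a column has been added without being described.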

DMP Requirements

Why are you being asked to include a data management plan (DMP) in your grant application? For grants awarded by US governmental agencies, two federal memos from the US Office of Science and Technology Policy (OSTP), issued in 2013 and 2015, prompted this requirement. These memos mandate public access to federally- (and, thus, taxpayer-) funded research results, reflecting a commitment by the government to greater accountability and transparency. While "results" generally refers to the publications and reports produced from a research project, it is increasingly used to refer to the resulting data as well.

Federal research-funding agencies  have responded to the OSTP memos by issuing their own guidelines and requirements for grant applicants (see below), specifying whether and how research data in particular are to be managed in order to be publicly and properly accessible.

  • NSF—National Science Foundation "Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labeled 'Data Management Plan'. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results." Note: Additional requirements may apply per Directorate, Office, Division, Program, or other NSF unit.
  • NIH—National Institutes of Health "To facilitate data sharing, investigators submitting a research application requesting $500,000 or more of direct costs in any single year to NIH on or after October 1, 2003 are expected to include a plan for sharing final research data for research purposes, or state why data sharing is not possible."
  • NASA—National Aeronautics and Space Administration "The purpose of a Data Management Plan (DMP) is to address the management of data from Earth science missions, from the time of their data collection/observation, to their entry into permanent archives."
  • DOD—Department of Defense "A Data Management Plan (DMP) describing the scientific data expected to be created or gathered in the course of a research project must be submitted to DTIC at the start of each research effort. It is important that DoD researchers document plans for preserving data at the outset, keeping in mind the potential utility of the data for future research or to support transition to operational or other environments. Otherwise, the data is lost as researchers move on to other efforts. The essential descriptive elements of the DMP are listed in section 3 of DoDI 3200.12, although the format of the plan may be adjusted to conform to standards established by the relevant scientific discipline or one that meets the requirements of the responsible Component"
  • Department of Education "The purpose of this document is to describe the implementation of this policy on public access to data and to provide guidance to applicants for preparing the Data Management Plan (DMP) that must outline data sharing and be submitted with the grant application. The DMP should describe a plan to provide discoverable and citable dataset(s) with sufficient documentation to support responsible use by other researchers, and should address four interrelated concerns—access, permissions, documentation, and resources—which must be considered in the earliest stages of planning for the grant."
  • OSTI (Office of Scientific and Technical Information): Provides access to free, publicly available research sponsored by the Department of Energy (DOE), including technical reports, bibliographic citations, journal articles, conference papers, books, multimedia, software, and data.

Data Management Best Practices

As you plan to collect data for research, keep in mind the following best practices. 

Keep Your Data Accessible to You

  • Store your temporary working files somewhere easily accessible, like on a local hard drive or shared server.
  • While cloud storage is a convenient solution for storage and sharing, there are often concerns about data privacy and preservation. Be sure to only put data in the cloud that you are comfortable with and that your funding and/or departmental requirements allow.
  • For long-term storage, data should be put into preservation systems that are well-managed. [U]Tech provides several long-term data storage options for cloud and campus. 
  • Don't keep your original data on a thumb drive or portable hard drive, as it can be easily lost or stolen.
  • Think about file formats that have a long life and that are readable by many programs. Formats such as plain ASCII text (.txt), .csv, and .pdf are well suited to long-term preservation.
  • A DMP is not a replacement for good data management practices, but it can set you on the right path if it is consistently followed. Consistently revisit your plan to ensure you are following it and adhering to funder requirements.

Preservation

  • Know the difference between storing and preserving your data. True preservation is the ongoing process of making sure your data are secure and accessible for future generations. Many sponsors have preferred or recommended data repositories. The DMP tool can help you identify these preferred repositories. 
  • Identify data with long-term value. Preserve the raw data and any intermediate/derived products that are expensive to reproduce or can be directly used for analysis. Preserve any scripted code that was used to clean and transform the data.
  • Whenever converting your data from one format to another, keep a copy of the original file and format to avoid loss or corruption of your important files.
  • Online platforms like OSF can help your group organize, version, share, and preserve your data if the sponsor hasn’t specified a platform.
  • Adhere to federal sponsor requirements on utilizing accepted data repositories (NIH dbGaP, NIH SRA, NIH CRDC, etc.) for preservation. 

Backup, Backup, Backup

  • The general rule is to keep 3 copies of your data: 2 copies onsite, 1 offsite.
  • Back up your data regularly and frequently, and automate the process if possible. This may mean weekly duplication of your working files to a separate drive, syncing your folders to a cloud service like Box, or dedicating a block of time every week to ensure you've copied everything to another location.
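
One way to automate the copying step is sketched below. It copies a file into each backup location and confirms the copy with a SHA-256 checksum. The function names and the idea of passing a list of destination folders are assumptions made for illustration; whether a destination counts as onsite or offsite depends entirely on where that folder lives.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List, Sequence


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: Sequence[Path]) -> List[Path]:
    """Copy source into every destination folder and verify each copy."""
    expected = sha256_of(source)
    copies = []
    for dest_dir in destinations:
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies
```

A script like this could be run on a schedule, for example weekly, with one destination on a second local drive and one on a synced or remote folder.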

Organization

  • Establish a consistent, descriptive filing system that is intelligible to future researchers and does not rely on your own inside knowledge of your research.
  • A descriptive directory and file-naming structure should guide users through the contents to help them find whatever they are looking for.

Naming Conventions

  • Use consistent, descriptive filenames that reliably indicate the contents of the file.
  • If your discipline requires or recommends particular naming conventions, use them!
  • Do not use spaces between words. Use either camelCase or underscores to separate words.
  • Include LastnameFirstname descriptors where appropriate.
  • Avoid MM-DD-YYYY date formats. Dates written as YYYY-MM-DD sort chronologically when files are listed by name.
  • Do not append vague descriptors like "latest" or "final" to your file versions. Instead, append the version's date or a consistently iterated version number.
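
The naming rules above can be combined into a small helper, sketched here under assumptions: the order of the parts, the zero-padded "v03" version style, and the function name build_filename are illustrative choices, and a discipline-specific convention should take precedence where one exists.

```python
import re
from datetime import date


def build_filename(project: str, description: str, last: str, first: str,
                   when: date, version: int, extension: str) -> str:
    """Build a descriptive file name with no spaces, an ISO date, and a version."""

    def clean(text: str) -> str:
        # Replace every run of non-alphanumeric characters with one underscore.
        return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")

    parts = [
        clean(project),
        clean(description),
        clean(last) + clean(first),   # LastnameFirstname descriptor
        when.isoformat(),             # YYYY-MM-DD instead of MM-DD-YYYY
        f"v{version:02d}",            # iterated version number, not "final"
    ]
    return "_".join(parts) + "." + extension.lstrip(".")


print(build_filename("soil survey", "raw counts", "Doe", "Jane",
                     date(2024, 1, 4), 3, "csv"))
# prints: soil_survey_raw_counts_DoeJane_2024-01-04_v03.csv
```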

Clean Your Data

  • Mistakes happen, and often researchers don't notice at first. If you are manually entering data, be sure to double-check the entries for consistency and duplication. Often having a fresh set of eyes will help to catch errors before they become problems.
  • Tabular data can often be error checked by sorting the fields alphanumerically to catch simple typos, extra spaces, or otherwise extreme outliers. Be sure to save your data before sorting it to ensure you do not disrupt the records!
  • Programs like OpenRefine are useful for checking for consistency in coding for records and variables, catching missing values, transforming data, and much more.
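
For small tabular files, some of the checks described above (extra spaces, missing values, duplicated records) can be scripted. The sketch below uses only the Python standard library and is not a substitute for a tool like OpenRefine. The function name find_problems and the rule that records differing only in case or surrounding spaces count as duplicates are assumptions made for this example, and it assumes one record per line.

```python
import csv
import io


def find_problems(csv_text: str) -> list:
    """Return (line_number, column, problem) tuples for simple data-entry errors."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    problems = []
    seen = {}  # normalized row -> line number where it first appeared
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for name, value in zip(header, row):
            if value != value.strip():
                problems.append((line_no, name, "extra whitespace"))
            elif value == "":
                problems.append((line_no, name, "missing value"))
        # Compare rows after trimming and lowercasing to catch near-duplicates.
        key = tuple(v.strip().lower() for v in row)
        if key in seen:
            problems.append((line_no, "*", f"duplicate of line {seen[key]}"))
        else:
            seen[key] = line_no
    return problems
```

Because the function only reads the text and reports findings, the original records are never reordered or modified.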

What should you do if you need assistance implementing RDM practices?

Whether it's because you need discipline-specific metadata standards for your data, help with securing sensitive data, or assistance writing a data management plan for a grant, help is available to you at CWRU. In addition to consulting the resources featured in this guide, you are encouraged to contact your department's liaison librarian.

If you are planning to submit a research proposal and need assistance with budgeting for data storage and/or applications used to capture, manage, and/or process data, UTech provides information and assistance, including resource boilerplates that list the centralized resources available.

More specific guidance for including a budget for Data Management and Sharing is included in this document: Budgeting for Data Management and Sharing.

Custody of Research Data

The PI is the custodian of research data, unless agreed on in writing otherwise and the agreement is on file with the University, and is responsible for the collection, management, and retention of research data. The PI should adopt an orderly system of data organization and should communicate the chosen system to all members of a research group and to the appropriate administrative personnel, where applicable. Particularly for long-term research projects, the PI should establish and maintain procedures for the protection and management of essential records.

CWRU Custody of Research Data Policy  

Data Sharing

Many funding agencies require data to be shared for the purposes of reproducibility and other important scientific goals. It is important to plan for the timely release and sharing of final research data for use by other researchers.  The final release of data should be included as a key deliverable of the DMP. Knowledge of the discipline-specific database, data repository, data enclave, or archive store used to disseminate the data should also be documented as needed. 

  • AUT Library
  • Library Guides
  • Research Support

Managing your research data

  • Writing a data management plan
  • Research data & data management

Research data management plan

AUT DMP tool, other guides and checklists

  • Collecting data
  • Managing data
  • Sharing data

A data management plan (DMP) is a formal document that outlines how you will handle your data both during your research and after the project is completed. This ensures that data are well-managed in the present and prepared for preservation in the future. A DMP is often required in grant proposals.

A research data management plan is a living document and should be reviewed and updated regularly.

WSG format for a data management plan mentioned in the video.

Your DMP should include a brief description of your project and of how data will be managed:

  • Roles and responsibilities
  • Ethics and policies/guidelines compliance 
  • Types of data, data format and documentation
  • Data storage, file backup and security
  • Access, sharing and archiving 

AUT guidelines and policies

  • Principles, policies and codes
  • Data management guidelines - AUT ethical guidelines and confidentiality requirements
  • AUT guide for drafting a data management plan

The  AUT Data Management Planning Tool  makes use of a platform developed and hosted by University of California Digital Library. By using this tool you will create a data management plan based on current AUT data management guidance.

Plans can be drafted on DMPTool and once complete are downloadable in PDF form for your own records. Settings in the tool allow you to control whether your plan is private, institutionally viewable or open to public view.

The questions and structure of the DMPTool have been customised for AUT researchers as part of a joint project between AUT Library and the University Research Office. If you would like to give constructive feedback on the tool please contact:  [email protected]

Sign in to AUT DMPTool:

  • Enter your AUT email in the sign-in box
  • On the next screen, click  Sign in with Institution (SSO)

Important:  To access the AUT Template, you must select 'No funder associated with this plan or my funder is not listed' on the Create Plan page.

  • Video - Creating a Data Management Plan (DMP) - Curtin University
  • ANDS guide for data management plans, by the Australian National Data Service.
  • Example DMPs: examples on the Digital Curation Centre website (UK).
  • Data management costing tool and checklist - UK Data Service
  • DCC guidance - The Digital Curation Centre (DCC), UK
  • Digital Curation Centre (DCC), Edinburgh
  • UK Data Archive
  • Last Updated: Nov 21, 2023 11:28 AM
  • URL: https://aut.ac.nz.libguides.com/RDM

University of Texas

  • University of Texas Libraries

Biosciences for Researchers

What is a data management plan?

A Data Management Plan (DMP) is a formal document outlining how you will handle data during your research project and after the project is complete.

  • Data management addresses the entire lifecycle of your data--from creation to organization, access, storage, preservation, and distribution.
  • Managing your research data ensures that it will be accessible and usable over time.
  • Managing your data is a key step in avoiding false claims and in not losing your information.
  • A data management plan may be required by a grant funding agency, plus it's a good idea to organize your data for you, your labmates, and colleagues.

Common Funder Requirements

The National Science Foundation has a data management plan requirement in place. The NSF began the requirement in 2011 and has since made updates and clarifications to the requirements. Most recently it has updated the guidelines to comply with the Nelson Memo (2022).

PAPPG 19  includes details under chapter IX.2.4

https://new.nsf.gov/funding/data-management-plan

Highlights of NSF Requirement:

  • No more than 2 pages
  • Label "Data Management Plan"
  • Part of the merit review
  • Outline a plan to make the data associated with your publications public

New NIH requirements took effect on January 25, 2023.

The new policy mandates a Data Management & Sharing Plan (DMS Plan) for grants regardless of funding threshold.

New Data Sharing Website

Elements of the Data Management & Sharing Plan are:

  • Related tools, software & code
  • Standards defined by the community that will be used (metadata)
  • Data preservation, access & timelines
  • Access, distribution & reuse considerations
  • Oversight of the plan

Private funding:

  • Bill and Melinda Gates Foundation: Data will be available immediately upon publication. A 12-month embargo may apply.
  • Simons Foundation: Requires a data sharing plan with the application. Post-publication, all associated data must be publicly available. See 12 - Renewable Reagents and Data Sharing.
  • Wellcome Trust: Applicants must include an outputs management plan describing research outputs. Data for research described in published papers should be publicly available.

  • Data Management Plan General Guidance

The library is happy to provide assistance with data management plans. Please book a time with me or email me to discuss your data management plan.

  • Last Updated: May 23, 2024 1:42 PM
  • URL: https://guides.lib.utexas.edu/biology


Writing a Data Management & Sharing Plan

Learn what NIH expects Data Management & Sharing Plans to address, as well as how to submit your Plan.

  • Applications for Receipt Dates BEFORE Jan 25 2023
  • Applications for Receipt Dates ON/AFTER Jan 25 2023

Writing a Data Sharing Plan

Under its 2003 data sharing policy, NIH expects investigators to submit a data sharing plan with requests for funding or grants, cooperative agreements, intramural research, contracts, or other funding agreements of $500,000 or more per year.

Data sharing plans should describe how an applicant will share their final research data. The specifics of the plan will vary on a case-by-case basis, depending on the type of data to be shared and how the investigator plans to share the data.

Examples of information to cover in a data sharing plan include:

  • The expected schedule for data sharing
  • The format of the dataset
  • The documentation to be provided with the dataset
  • Whether any analytic tools also will be provided
  • Whether a data-sharing agreement will be required and, if so, a brief description of such an agreement
  • Criteria for deciding who can receive the data
  • Whether or not any conditions will be placed on their use

Investigators choosing to handle their own data sharing may wish to enter into a data-sharing agreement.

Generating large-scale genomic data? NIH’s Genomic Data Sharing (GDS) policy may also apply to your research. See our GDS Policy Overview to learn more.

Examples of Data Sharing Plans

The exact content and level of detail to be included in a data sharing plan depends on the specifics of the project, such as how the investigator is planning to share data, or the size and complexity of the dataset. The examples below give a sense of what a data sharing plan can look like. 

Example 1 This application requests support to collect public-use data from a survey of more than 22,000 Americans over the age of 50 every 2 years. Data products from this study will be made available without cost to researchers and analysts. User registration is required in order to access or download files. As part of the registration process, users must agree to the conditions of use governing access to the public release data, including restrictions against attempting to identify study participants, destruction of the data after analyses are completed, reporting responsibilities, restrictions on redistribution of the data to third parties, and proper acknowledgment of the data resource. Registered users will receive user support, as well as information related to errors in the data, future releases, workshops, and publication lists. The information provided to users will not be used for commercial purposes, and will not be redistributed to third parties.

Example 2 The proposed research will include data from approximately 500 subjects being screened for three bacterial sexually transmitted diseases (STDs) at an inner city STD clinic. The final dataset will include self-reported demographic and behavioral data from interviews with the subjects and laboratory data from urine specimens provided. Because the STDs being studied are reportable diseases, we will be collecting identifying information. Even though the final dataset will be stripped of identifiers prior to release for sharing, we believe that there remains the possibility of deductive disclosure of subjects with unusual characteristics. Thus, we will make the data and associated documentation available to users only under a data-sharing agreement that provides for: (1) a commitment to using the data only for research purposes and not to identify any individual participant; (2) a commitment to securing the data using appropriate computer technology; and (3) a commitment to destroying or returning the data after analyses are completed.

Example 3 The proposed research will involve a small sample (less than 20 participants) recruited from clinical facilities in the New York City area with Williams syndrome. This rare craniofacial disorder is associated with distinguishing facial features. Even with the removal of all identifiers, we believe that it would be difficult if not impossible to protect the identities of subjects given the physical characteristics of subjects, the type of clinical data (including imaging) that we will be collecting, and the relatively restricted area from which we are recruiting subjects. Therefore, we are not planning to share the data.

What data will be shared:

I will share phenotypic data associated with the collected samples by depositing these data at ________________ which is an NIH-funded repository. Genotype data will be shared by depositing these data at ________________. Additional data documentation and de-identified data will be deposited for sharing along with phenotypic data, which includes demographics, family history of XXXXXX disease, and diagnosis, consistent with applicable laws and regulations. I will comply with the NIH GWAS Policy and the funding IC’s existing policies on sharing data on XXXXXX disease genetics to include secondary analysis of data resulting from a genome wide association study through the repository. Meta-analysis data and associated phenotypic data, along with data content, format, and organization, will be available at ____________. Submitted data will conform to relevant data and terminology standards.

Who will have access to the data:

I agree that data will be deposited and made available through ________________ which is an NIH-funded repository, and that these data will be shared with investigators working under an institution with a Federal Wide Assurance (FWA) and could be used for secondary study purposes such as finding genes that contribute to process of XXXXXX.  I agree that the names and Institutions of persons either given or denied access to the data, and the bases for such decisions, will be summarized in the annual progress report.  Meta-analysis data and associated phenotypic data, along with data content, format, and organization, will be made available to investigators through ____________.

Where will the data be available:

I agree to deposit and maintain the phenotypic data, and secondary analysis of data (if any) at ________________, which is an NIH-funded repository and that the repository has data access policies and procedures consistent with NIH data sharing policies.

When will the data be shared:

I agree to deposit genetic outcome data into ________________ repository as soon as possible but no later than within one year of the completion of the funded project period for the parent award or upon acceptance of the data for publication, or public disclosure of a submitted patent application, whichever is earlier.

How will researchers locate and access the data:

I agree that I will identify where the data will be available and how to access the data in any publications and presentations that I author or co-author about these data, as well as acknowledge the repository and funding source in any publications and presentations.  As I will be using ________________, which is an NIH-funded repository, this repository has policies and procedures in place that will provide data access to qualified researchers, fully consistent with NIH data sharing policies and applicable laws and regulations.

How to Submit Data Sharing Plans

The plan should be included in the Resource Sharing section  of the application. See the  How to Apply – Application Guide  for form instructions.

Writing a Data Management and Sharing Plan

Under the 2023 Data Management and Sharing (DMS) Policy, NIH expects researchers to maximize the appropriate sharing of scientific data, taking into account factors such as legal, ethical, or technical issues that may limit the extent of data sharing and preservation.

NIH requires all applicants planning to generate scientific data to prepare a DMS Plan that describes how the scientific data will be managed and shared. For more on what constitutes scientific data, see Research Covered Under the Data Management & Sharing Policy.

Applications subject to NIH’s Genomic Data Sharing (GDS) Policy should also address GDS-specific considerations within the elements of a DMS Plan (see NOT-OD-22-189 and details below).

Submitting Data Management and Sharing Plans

The DMS Plan should be submitted as follows:

  • DMS Plans should be included within the “Other Plan(s)” field on the PHS 398 Research Plan or PHS 398 Career Development Award Supplemental Form as indicated in the Application Instructions . See below for details on developing and formatting Plans.
  • A brief summary and associated costs should be submitted as part of the budget and budget justification (see Budgeting for Data Management and Sharing and the Application Instructions for details).
  • Extramural (contracts): as part of the technical evaluation
  • Intramural: determined by the Intramural Research Program
  • Other funding agreements: prior to the release of funds

Data Management and Sharing Plan Format

DMS Plans are recommended to be two pages or less in length.

NIH has developed an optional DMS Plan format page that aligns with the recommended elements of a DMS Plan.

Important: Do not include hypertext (e.g., hyperlinks and URLs) in the DMS Plan attachment.

Elements to Include in a Data Management and Sharing Plan

As outlined in NIH Guide Notice Supplemental Policy Information: Elements of an NIH Data Management and Sharing Plan , DMS Plans should address the following recommended elements and are recommended to be two pages or less in length. As described in the Application Guide, the DMS Plan should be attached to the application as a PDF file. See NIH’s Format Attachments page.

1. Data Type

Briefly describe the scientific data to be managed and shared:

  • Summarize the types (for example, 256-channel EEG data and fMRI images) and amount (for example, from 50 research participants) of scientific data to be generated and/or used in the research. Descriptions may include the data modality (e.g., imaging, genomic, mobile, survey), level of aggregation (e.g., individual, aggregated, summarized), and/or the degree of data processing.
  • Describe which scientific data from the project will be preserved and shared. NIH does not anticipate that researchers will preserve and share all scientific data generated in a study. Researchers should decide which scientific data to preserve and share based on ethical, legal, and technical factors. The plan should provide the reasoning for these decisions.
  • Provide a brief listing of the metadata, other relevant data, and any associated documentation (e.g., study protocols and data collection instruments) that will be made accessible to facilitate interpretation of the scientific data.

For data subject to the GDS Policy: Data types expected to be shared under the GDS Policy should be described in this element. Note that the GDS Policy expects certain types of data to be shared that may not be covered by the DMS Policy’s definition of “scientific data”. For more information on the data types to be shared under the GDS Policy, consult Data Submission and Release Expectations .

2. Related Tools, Software and/or Code

Indicate whether specialized tools are needed to access or manipulate shared scientific data to support replication or reuse, and name(s) of the needed tool(s) and software. If applicable, specify how needed tools can be accessed.

3. Standards

Describe what standards, if any, will be applied to the scientific data and associated metadata (i.e., data formats, data dictionaries, data identifiers, definitions, unique identifiers, and other data documentation).

4. Data Preservation, Access, and Associated Timelines

Give plans and timelines for data preservation and access, including:

  • The name of the repository(ies) where scientific data and metadata arising from the project will be archived. See Selecting a Data Repository for information on selecting an appropriate repository.
  • How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.

  • When the scientific data will be made available to other users and for how long. Identify any differences in timelines for different subsets of scientific data to be shared.

Note that NIH encourages scientific data to be shared as soon as possible, and no later than the time of an associated publication or end of the performance period, whichever comes first. NIH also encourages researchers to make scientific data available for as long as they anticipate it being useful for the larger research community, institutions, and/or the broader public.

For data subject to the GDS Policy:

  • For human genomic data: Investigators are expected to submit data to a repository acceptable under the Genomic Data Sharing Policy. See Where to Submit Genomic Data. Human genomic data is expected to be shared according to NIH’s Data Submission and Release Expectations, but no later than the end of the performance period, whichever comes first.
  • For non-human genomic data: Investigators may submit data to any widely used repository. Non-human genomic data is expected to be shared as soon as possible, but no later than the time of an associated publication, or end of the performance period, whichever is first.
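The "whichever comes first" rule for the general sharing timeline can be sketched as a small date calculation. This is an illustrative helper only, not an NIH tool: the function name `latest_sharing_date` and its parameters are hypothetical, and it covers only the general rule (publication date or end of the performance period), not the GDS-specific expectations.

```python
from datetime import date
from typing import Optional


def latest_sharing_date(performance_period_end: date,
                        publication_date: Optional[date] = None) -> date:
    """Return the latest date by which data should be shared.

    Hypothetical helper: applies "no later than the time of an associated
    publication or end of the performance period, whichever comes first".
    """
    if publication_date is None:
        # No publication yet, so the end of the performance period governs.
        return performance_period_end
    return min(publication_date, performance_period_end)


# Example: a paper published before the award ends pulls the date forward.
print(latest_sharing_date(date(2027, 6, 30), date(2026, 3, 15)))
```

A plan that lists different timelines for different data subsets could call such a helper once per subset.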

5. Access, Distribution, or Reuse Considerations

Describe any applicable factors affecting subsequent access, distribution, or reuse of scientific data related to:

  • Informed consent
  • Privacy and confidentiality protections consistent with applicable federal, Tribal, state, and local laws, regulations, and policies
  • Whether access to scientific data derived from humans will be controlled
  • Any restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or existing or anticipated agreements

  • Any other considerations that may limit the extent of data sharing

Any potential limitations on subsequent data use should be communicated to the individuals or entities (for example, data repository managers) that will preserve and share the scientific data. The NIH ICO will assess whether an applicant’s DMS Plan appropriately considers and describes these factors. For examples of justifiable reasons for limiting sharing of data, see Frequently Asked Questions.

Expectations for human genomic data subject to the GDS Policy:

Informed Consent Expectations:

  • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected AFTER the effective date of the GDS Policy (January 25, 2015): NIH expects that informed consent for future research use and broad data sharing will have been obtained. This expectation applies to de-identified cell lines or clinical specimens regardless of whether the data meet technical and/or legal definitions of de-identified (i.e., the research does not meet the definition of “human subjects research” under the Common Rule).
  • For research involving the generation of large-scale human genomic data from cell lines or clinical specimens that were created or collected BEFORE the effective date of the GDS Policy: There may or may not have been consent for research use and broad data sharing. NIH will accept data derived from de-identified cell lines or clinical specimens lacking consent for research use that were created or collected before the effective date of this Policy.

Institutional Certifications and Data Sharing Limitation Expectations: DMS Plans should address limitations on sharing by anticipating sharing according to the criteria of the Institutional Certification. In cases where it is anticipated that Institutional Certification criteria cannot be met (i.e., data cannot be shared as expected by the GDS Policy), investigators should state the Institutional Certification criteria in their DMS Plan, explaining why the element cannot be met, and indicating what data, if any, can be shared and how to enable sharing to the maximal extent possible (for example, sharing data in a summary format). In some instances, the funding NIH ICO may need to determine whether to grant an exception to the data submission expectation under the GDS Policy.

Genomic Summary Results: Investigators conducting research subject to the GDS Policy should indicate in their DMS Plan if a study should be designated as “sensitive” for the purposes of access to Genomic Summary Results (GSR), as described in NOT-OD-19-023.

6. Oversight of Data Management and Sharing

Indicate how compliance with the DMS Plan will be monitored and managed, the frequency of oversight, and by whom (e.g., title, roles). This element refers to oversight by the funded institution, rather than by NIH. The DMS Policy does not create any expectations about who will be responsible for Plan oversight at the institution.

Sample Plans

NIH has provided sample DMS Plans as examples of how a DMS Plan could be completed in different contexts, conforming to the elements described above. These sample DMS Plans are provided for educational purposes to assist applicants with developing Plans but are not intended to be used as templates and their use does not guarantee approval by NIH.

Note that the sample DMS Plans provided below may reflect additional expectations established by NIH or specific NIH Institutes, Centers, or Offices that go beyond the DMS Policy. Applicants will need to ensure that their Plan reflects any additional, applicable expectations (including from NIH policies and any ICO- or program-specific expectations as stated in the FOA).

Assessment of Data Management and Sharing Plans

Program staff at the proposed NIH Institute or Center (IC) will assess DMS Plans to ensure the elements of a DMS Plan have been adequately addressed and to assess the reasonableness of those responses. Applications selected for funding will only be funded if the DMS Plan is complete and acceptable.

During peer review, reviewers will not be asked to comment on the DMS Plan nor will they factor the DMS Plan into the Overall Impact score, unless sharing data is integral to the project design and specified in the funding opportunity (see NOT-OD-22-189).

If data sharing is integral to the project and tied to a scored review criterion in the funding opportunity, program staff will assess the adequacy of the DMS Plan per standard procedure, but peer reviewers will also be able to view the DMS Plan attachment and may factor that information into scores as outlined in the evaluation criteria.

For information about budget assessment by peer reviewers, see Budgeting for Data Management and Sharing.

Revising Data Management and Sharing Plans

Pre-Award Plan Revisions: If the DMS Plan provided in the application cannot be approved based on the information provided, applicants will be notified that additional information is needed. This will occur through the Just-in-Time (JIT) process. Applicants will be expected to communicate with their Program Officer and/or Grants Management Specialist to resolve any issues that prevent the funding IC from approving the DMS Plan. If needed, applicants should submit a revised DMS Plan. Refer to NIH Grants Policy Statement Section 2.5.1 Just-in-Time Procedures for additional guidance.

Post-Award Plan Revisions: Although investigators submit plans before research begins, plans may need to be updated or revised over the course of a project for a variety of reasons, for example, if the type(s) of data generated change(s), a more appropriate data repository becomes available, or the sharing timeline shifts. If any changes occur during the award or support period that affect how data is managed or shared, investigators should update the Plan to reflect the changes. It may be helpful to discuss potential changes with the Program Officer. In addition, the funding NIH ICO will need to approve the updated Plan. NIH staff will also monitor compliance with approved DMS Plans during the annual RPPR process. For more details, please refer to NOT-OD-23-185: Prior Approval Requests for Revisions to an Approved Data Management and Sharing (DMS) Plan Must be Submitted Using the Prior Approval Module.

Additional Considerations

Note that funding opportunities or ICs may have specific expectations (for example: scientific data to share, relevant standards, repository selection). View a list of NIH Institute or Center data sharing policies. Investigators are encouraged to reach out to program officers with questions about specific ICO requirements.

Please note that a Plan is part of an application, and, as such, an institution takes responsibility for the Plan and the rest of the application's contents when submitting an application. Although part of the official submission, when not considered during peer review the attachment is maintained as a separate “Data Management and Sharing (DMS) Plan” document in the grant folder viewable via the Status Information screen in eRA Commons. This document is viewable by authorized users and is not part of the assembled e-Application.

New Data Management & Sharing Policy Effective January 25, 2023!



Managing data across the research lifecycle


Learn more about using the research data lifecycle to inform your data management planning

What are research data at UCL?

According to the UCL Research Data policy, data are: “facts, observations or experiences on which an argument or theory is constructed or tested. Data may be numerical, descriptive, aural or visual. Data may be raw, abstracted or analysed, experimental or observational. Data include but are not limited to: laboratory notebooks; field notebooks; questionnaires; texts; audio files; video files; models; photographs; test responses”.

There are three kinds of data:

  • Open - data which are freely available online;
  • Controlled/restricted - data access is restricted because there are ethical, legal and/or commercial reasons prohibiting their open release. Potential secondary users must meet certain criteria before access is given;
  • Closed - data which are permanently embargoed due to their nature.

What is the research data lifecycle?

The research data lifecycle models the different phases of the research process - from planning and preparation through to archiving and sharing - making your research and outputs discoverable to the wider research community and members of the public. There are four phases:

  • Planning and preparation - You've had an idea for a research study so it's time to start making plans and getting prepared. It's usually during this phase you will write a data management plan and perhaps submit it as part of a grant application.
  • Active research - You are now actively researching, putting all those research plans into action.
  • Archiving, curating and preserving - The research is complete and it's time to archive your research outputs to preserve them for the longer term.
  • Discovery, access and sharing - Making your research discoverable to others for potential reuse can help to maximise research opportunities.


Four phases of the research data lifecycle - Planning and preparation; Active research; Archiving, curating and preserving; Discovery, access and sharing.

  • Download high resolution  Research Data Lifecycle  (PDF)

What is a data management plan?

A Data Management Plan (DMP) describes your planned and/or actioned data management and sharing activities. It is generally 1-3 pages in length and should cover the four phases of the research data lifecycle. It is generally written at the start of a research project and should be revisited at different stages of the project and updated where necessary. DMPs may be published in the UCL Research Data Repository and assigned a DOI.

When writing your plan, remember to check if any funder's policies and requirements apply to your research. A range of how-to guides are also available to assist you in writing your plan.

UCL Data Management Plan template

Download our Data Management Plan template (MS Word)

Guidance is provided as comments in the margins.

Why are data management plans useful?

In addition to often being a prerequisite to receiving certain grants, DMPs are useful for:

  • maximising the research potential of existing research outputs by reusing and repurposing them;
  • thinking about and developing your strategy for issues such as data storage and long-term preservation, handling of sensitive data, and data retention and sharing, early on in your research;
  • anticipating legal, ethical and commercial exceptions to releasing data, and deciding who can have access to data in the short and long term;
  • estimating the costs of your research project, which can then be included in your project budget.

Before you get started

Here are a few tips to help you start writing a DMP:

  • Verify which data management and data sharing policies apply - these could be institutional, funder or journal publisher-led.
  • Identify whether you will need to enter into a data sharing agreement before datasets and other study materials may be shared. There could also be legal frameworks and copyright issues to be mindful of. There is more information about material transfer agreements.
  • Where research involves living human participants, it is recommended you speak with the Data Protection team to confirm which data protection legislation applies. Where you are collaborating with partners based globally, confirm whether international data protection legislation applies to your research.
  • Verify submission deadlines.

DMP training and review service

The RDM team offers both face-to-face and online training courses on how to write a data management plan. Using the UCL DMP template, attendees have the opportunity to write a data management plan which they can take away with them and use as a basis for a more detailed plan of their data management and sharing activities.

For more help and advice, contact your Research Data Support Officers who can also review drafted UCL Data Management Plans if you send them in advance of submission (allow 1 to 2 weeks at least before your submission deadline).

UCL Research Data Policy

The UCL Research Data Policy describes UCL's expectations relating to data management and sharing within the wider Open Science context.

DMPonline, a free tool created by the DCC, provides a framework for creating your Data Management Plan. UCL guidance is now incorporated into DMPonline; see our further guidance on using the tool.

Data management plans

All University of Oxford researchers, whether funded or not, are encouraged to create a data management plan (DMP) as part of good scholarly practice. It's also common for funding bodies to require a DMP to be included as part of any funding application. 

About data management plans

What is a data management plan?

A data management plan is a document which outlines how data will be managed throughout the whole project life cycle. Generally a plan covers initial decisions, how data will be handled during the active phase of research, and longer-term questions of preservation and sharing. The plan may be updated and revised as a project develops.

Why should I consider making one?

Even when not explicitly required by a funding body, a DMP is well worth creating. The planning process is a chance to think through what's needed to allow the project to run as smoothly as possible. It can also help anticipate possible problems before they occur, meaning that solutions can be found in good time.

Many aspects of data management are straightforward if they’re planned for from the beginning, but much harder to do retrospectively. Making a plan will therefore often save time and reduce stress later in the project.

Planning ahead can bring particular benefits when it comes to preparing data for sharing. For example, documenting what’s happened to data can be done quickly and easily if good recording processes are built into the research methodology; trying to unpick what’s been done later on is likely to be much more arduous.

Having a solid plan also means you're better prepared for unforeseen developments. Having thought through all the relevant issues means you're less likely to be taken by surprise - and you'll be better placed to respond if the unexpected does crop up.

Does my funder require a data management plan?

Many major funding bodies now require a data management plan as part of the grant application process. Please see the  Funder requirements  section for information about specific funders.

Can a data management plan be revised during the course of a project?

Absolutely! A DMP should be treated as a dynamic, evolving document, to be updated as necessary throughout the project. It is therefore good practice for the initial plan to include a schedule for future review and revision, and perhaps even to use a file name with version/date details.
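As a minimal sketch of the file-naming suggestion, the helper below builds a name carrying both a version number and a date. The naming pattern and the function name `versioned_filename` are assumptions chosen for illustration, not a convention set by any funder or institution.

```python
from datetime import date


def versioned_filename(stem: str, version: int, on: date, ext: str = "docx") -> str:
    """Build a file name such as 'dmp_v03_2024-01-30.docx'.

    Hypothetical pattern: zero-padded version plus ISO date, so that
    names sort in revision order in a file listing.
    """
    return f"{stem}_v{version:02d}_{on.isoformat()}.{ext}"


print(versioned_filename("dmp", 3, date(2024, 1, 30)))  # dmp_v03_2024-01-30.docx
```

Zero-padding the version and using an ISO date keeps alphabetical order and chronological order the same, which makes the latest revision easy to find.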

What should a data management plan include?

If your funder requires a data management plan, they will usually provide a template or a set of guidelines about what to include.

Templates based on the requirements of a range of major funders are available via the  DMPonline  tool. There's also a generic template, which is ideal if you're creating a plan for your own benefit.

Whichever template is used, a DMP will typically cover the elements listed below.

Description of data and related materials being created

Outline the content, quantity, and format of the data. This will help inform the rest of the plan: for example, the quantity of material will have an impact on the type of storage that is appropriate.

You may also need to discuss how documentation, metadata, and software will be created and maintained.

Further information on some of the above topics can be found in the  Data handling and acquisition  section of this site.

Handling of data and key responsibilities

Provide an overview of how research data will be collected, processed, stored, and otherwise dealt with during the research project.

If a team of researchers is involved, responsibilities can be assigned as appropriate. The PI will typically retain overall responsibility for data management, but may delegate specific tasks (e.g. managing data collection or documentation). Some projects may have a designated data manager, field workers, or similar site specific personnel.

Further information can be found in the  Data handling and acquisition  section of this site.

Data security

Outline how you will use institutional services and infrastructure to ensure data is secure and backed up. This is particularly important where confidential or sensitive materials are being used.

Further information can be found in the  Keeping working data safe  section of this site.

Ethical issues

You'll need to show that you're aware of any ethical issues raised by the collection and handling of data, and that you have a plan in place to deal with these. For example, this might include the need for informed consent from research participants, secure storage for confidential material, or anonymisation or redaction of data which will be made available for reuse.

Further information can be found in the  Ethical and legal issues  section of this site.

Intellectual property and legal issues

If you're using third party data, give details of any permissions that are required (and the process for securing these). For collaborative projects, you may need to clarify ownership of data and other outputs resulting from the research project.

If your research project involves personal data, you'll need to set out your plans for ensuring compliance with the relevant data protection legislation (including GDPR).

Other legal issues may also need to be covered, depending on the nature of the project - for example, if the project may lead to a commercial spin-out, or if a collaboration with the commercial sector is required.

Long-term preservation

Indicate what you intend to do with data, software and metadata after the research project concludes. If data will be deposited in an archive or repository, say which one; if not, alternative preservation arrangements should be described. Some funders specify a minimum preservation period, and/or a preferred destination for data, so it's worth ensuring you're aware of any specific requirements. 

If you are planning to destroy any of your data once the project is complete, the reasons for this should be made clear. If secure deletion is required, the process for this should also be covered.

Further information can be found in the  Post-project data preservation  section of this site.

Data sharing and access

Outline your plans for sharing data for reuse by other researchers (or more widely). If some or all data is unsuitable for sharing, or can only be shared with restrictions, explain why this is. Any processing or preparation of data needed (e.g. anonymisation of personal data) should also be described.

As with preservation, some funders may have specific requirements regarding making data available for reuse, so it's worth checking you know what these are.

Further information can be found in the  Sharing data  section of this site.

What help is available for writing a data management plan?

DMPonline - a web-based tool for creating data management plans

DMPonline  is a free, web-based tool that you can use to build a data management plan. It provides a selection of templates (including those mandated by major UK funders) along with guidance and links to useful resources. Plans can be shared with collaborators, and exported in a range of formats.

If you indicate that your institution is the University of Oxford, you will be given the option of seeing Oxford-specific guidance.

DMPonline is maintained by the Digital Curation Centre, a national body whose website provides a wealth of additional information, including details of funders’ requirements, example DMPs, a useful DMP checklist, and a how-to guide on creating one.

University of Oxford support and resources

The Research Data Oxford team can provide assistance with drafting DMPs: email  [email protected]  if you would like to set up a meeting, or to request feedback on a draft.

There are also regular training courses, provided via the  IT Learning Centre  and the  iSkills programme .

Additionally, a range of local and central support is available for specific aspects of a plan: research facilitators will be able to help prepare project costings with X5, CUREC will help you through the ethics approval process, and local and central IT staff can advise on technical infrastructure and resources.


Research Data Services


Research Data Management Overview


What is a Data Management Plan?

A data management plan (DMP) outlines the details of how the data for a research project is handled, during and after the research project. DMPs are required by many funding agencies as part of their application process and are considered a best practice for research. On this page, you will find resources for writing your data management plan, the expectations for data sharing, and additional resources at UGA.

Elements in a Data Management Plan:

  • the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project;
  • the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);
  • policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;
  • policies and provisions for re-use, re-distribution, and the production of derivatives; and
  • plans for archiving data, samples, and other research products, and for preservation of access to them.

Writing Data Management Plans

  • Build your data management plan  - The DMPTool helps with creating data management plans that meet funder requirements.
  • Preparing your data management plan  - This page provides an overview of requirements for the data management plan.
  • Proposal Preparation Instructions

Expectations for Data Sharing

  • Browse Article and Data Sharing Requirements by Federal Agency, by SPARC
  • Desirable Characteristics of Data Repositories for Federally Funded Research, by National Science and Technology Council (Subcommittee on Open Science), May 2022
  • Selecting a Data Repository, by NIH
  • Data Sharing Policies and Expectations, by NIH

Basics for the National Science Foundation

  • Research IT
  • Last Updated: Jan 30, 2024 9:51 PM
  • URL: https://guides.libs.uga.edu/researchdata

Master data management: The key to getting more from your data

Picture this: a sales representative at a multibillion-dollar organization has an upcoming meeting with a prospective client. She searches for the client in the organization’s customer relationship management software and finds several accounts with the same name. She struggles to learn more about the products and services the client is already buying, the customer contacts that have already been engaged, and the relationships the contact may have with other sales representatives within the organization. As a result, the sales representative spends several hours manually pulling together information to get organized for the upcoming meeting.

About the authors

This article is a collaborative effort by Aziz Shaikh, Holger Harreis, Jorge Machado, and Kayvaun Rowshankish, with Rachit Saxena and Rajat Jain, representing views from McKinsey Digital.

This scenario is an example of poor master data management (MDM), which commonly results in suboptimal customer and employee experience, higher costs, and lost revenue opportunities. MDM is a critical component of any organization’s data strategy (see sidebar “About master data management”). These capabilities can make or break an organization’s efficiency and reliability—particularly in complex organizations with multiple business units, where data silos can lead to inefficiencies and errors.

About master data management

Typically, organizations have four types of data: transaction, reference, derived, and master. Of these, master data provides the most relevant, foundational information about entities and their attributes, unique identifiers, hierarchies, and relationships within an organization. This information is shared across business functions and systems to support business processes and decision making.

In 2023, McKinsey surveyed more than 80 large global organizations (companies surveyed earned more than $100 million in annual revenue) across several industries to learn more about how they organize, use, and mature their master data. McKinsey’s Master Data Management Survey indicated that organizations have four top objectives in maturing their MDM capabilities: improving customer experience and satisfaction, enhancing revenue growth by presenting better cross- and up-selling opportunities, increasing sales productivity, and streamlining reporting (Exhibit 1).

MDM plays an important role with modern data architecture concepts and creates value in five ways:

  • MDM cleans, enriches, and standardizes data for key functions, such as customer or product data, before it is loaded into the data lake. In this way, MDM ensures that data is accurate, complete, and consistent across an organization.
  • In the context of data products, MDM provides a hub for high-quality data across entities, which improves the effectiveness, consistency, and reliability of data products for improved decision making, accurate reporting and analysis, and compliance with local regulations and standards.
  • MDM standardizes data across entities to provide a unified view across various systems.
  • MDM can act as a system of reference that shares data with applications and other domains via web services, typically representational state transfer application programming interfaces (REST APIs).
  • MDM and artificial intelligence (AI) can benefit from each other. For instance, MDM can leverage AI algorithms to identify duplicate records and merge them intelligently, which can enhance the performance and reliability of generative AI systems.
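To make the idea of identifying and merging duplicate records concrete, here is a minimal sketch. It deliberately swaps in simple rule-based name normalization rather than the AI matching the list mentions, and every name in it (`normalize_name`, `merge_duplicates`, the record fields, the sample data) is hypothetical and for illustration only.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def merge_duplicates(records):
    """Group records by normalized name and merge each group into one record.

    For each field, the first non-empty value found in the group wins;
    the ids of all contributing records are kept for traceability.
    """
    groups = defaultdict(list)
    for record in records:
        groups[normalize_name(record["name"])].append(record)

    merged = []
    for group in groups.values():
        golden = {}
        for record in group:
            for field, value in record.items():
                if value and not golden.get(field):
                    golden[field] = value
        golden["source_ids"] = [record["id"] for record in group]
        merged.append(golden)
    return merged


# Made-up sample records from two hypothetical source systems.
records = [
    {"id": "crm-1", "name": "Acme, Inc.", "phone": ""},
    {"id": "cdp-7", "name": "ACME Inc", "phone": "555-0100"},
    {"id": "crm-2", "name": "Globex LLC", "phone": None},
]
for row in merge_duplicates(records):
    print(row)
```

Exact matching on a normalized key misses misspellings and abbreviations, which is where fuzzy or learned matching would come in; the merge step and the retained source ids would look much the same either way.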

But many organizations have not fully harnessed the potential of MDM. This article builds on the insights from our MDM survey, describes the common challenges companies face when integrating MDM capabilities, and highlights areas in which MDM could be optimized to help businesses gain a competitive advantage.

Common issues organizations face when implementing MDM

Small and large organizations alike can benefit from implementing MDM models, yet collecting and aggregating quality data can be difficult because of funding constraints, insufficient technological support, and low-caliber data. Based on our survey results, the following are some of the most prevalent challenges to implementing MDM.

Difficulty of making a business case

Demonstrating potential savings through reduced data errors, enhanced operational efficiency, and improved decision making can provide a clear return on investment for MDM initiatives. However, this return is inherently difficult to quantify, so positioning MDM as a priority ahead of projects with more visible, immediate benefits can be challenging. Consequently, despite MDM’s potential to enhance an organization, leaders may have a difficult time building a business case for augmenting their MDM and investing in associated architecture and technology capabilities.


Organizational silos

Eighty percent of organizations responding to our survey reported that some of their divisions operate in silos, each with its own data management requirements, practices, source systems, and consumption behaviors. For example, a sales team may maintain client data in a customer relationship management (CRM) system, while a marketing team may use a client data platform (CDP) to create customer profiles and inform ad campaigns. Silos can lead to inconsistencies and errors, increasing the difficulty of making decisions related to business, data, and technology (see sidebar “Types of master data domains”).

Types of master data domains.

A variety of categories can serve as master data domains, and each serves a specific purpose. The most common categories include the following:

Customer data. Customer data includes key details such as customer contact information, purchasing history, preferences, and demographic data. Organizations can leverage customer data to optimize marketing strategies, personalize customer experiences, and foster long-term relationships.

Client data. Client data typically includes client names, contact information, billing and shipping addresses, payment terms, key decision makers, and other client-specific identifiers. Business-to-business (B2B) organizations can manage client data to tailor their strategies, personalize communications, and optimize sales and marketing efforts to better serve their clients’ needs and preferences.

Product data. Product data includes attributes such as product names, descriptions, SKUs, pricing, and specifications. Product data typically spans across R&D, supply chain, and sales.

Supplier data. Supplier data includes attributes such as vendor names, contact details, payment terms, tax information, and vendor-specific codes. Accurate supplier data helps to establish a single, complete, and consistent definition of vendors across the organization.

Financial data. Financial data typically includes information about legal or management entities (a company code, for instance), a chart of accounts, cost and profit centers, and financial hierarchies.

Employee data. Employee data includes attributes such as employee names, contact information, job titles, employee IDs, department assignments, and payroll information.

Asset data. Asset data includes attributes such as asset name, type, purchase date, installation date, manufacturer details, financial and depreciation details, and maintenance and repair details. Organizations can improve their operational performance by maintaining consistent, accurate, and efficient management of assets across an organization.

According to the McKinsey Master Data Management Survey 2023, 83 percent of organizations consider client and product data to be the most dominant domains.

Treating MDM as a technology discipline only

Organizations typically think of MDM as a technology discipline rather than as a differentiator that can drive enterprise value. According to our survey, only 16 percent of MDM programs are funded as organization-wide strategic programs, leaving IT or tech functions to carry the financial responsibility (Exhibit 2). Sixty-two percent of respondents reported that their organizations had no well-defined process for integrating new and existing data sources, which may hinder the effectiveness of MDM.

While technology plays a crucial role, the success of MDM initiatives requires significant business influence and sponsorship to set the strategic direction, understand data dependencies, improve the quality of data, enhance business processes, and, ultimately, support the organization in achieving its goals. It’s important for the role of data owner to be played by a business stakeholder—specifically, the head of the business unit that uses the data most, such as the head of sales and marketing for the client data domain. That leader can provide guidance for defining data requirements and data quality rules that are aligned with the business’s goals.

Poor data quality

Poor-quality data cannot deliver analytics-based insights without substantial manual adjustment. According to the MDM survey, 82 percent of respondents spent one or more days per week resolving master data quality issues, and 66 percent used manual review to assess, monitor, and manage the quality of their master data. Consequently, large, multidivisional organizations may be unable to efficiently generate KPIs or other metrics, and sales representatives may be unable to quickly generate a consistent, holistic view of prospective clients. According to the MDM survey, the most prevalent issues in organizations’ data quality were incompleteness, inconsistency, and inaccuracy (Exhibit 3).

In addition to incompleteness, inconsistency, and inaccuracy, many companies also contend with issues of uniqueness, or duplicate information, across systems. Traditionally, organizations classify data assets based on the stakeholders they interact with, but this approach can lead to duplication of information. For example, a supplier to an organization can also be its customer. These circumstances have led to the design of a “party” data domain that generalizes the characteristics of a person or organization and establishes the connection between that party and its distinct roles in relation to the company.
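As a rough illustration of the “party” idea, the sketch below stores each person or organization once and attaches roles to it, so a company that is both a supplier and a customer is not duplicated. The class name, field names, and IDs are hypothetical and chosen only for this example; real MDM tools model parties in their own ways.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """A person or organization, stored once regardless of its roles."""
    party_id: str
    name: str
    roles: set[str] = field(default_factory=set)  # e.g. {"supplier", "customer"}


def add_role(registry: dict[str, Party], party_id: str, name: str, role: str) -> Party:
    """Attach a role to an existing party, or create the party if it is new."""
    party = registry.setdefault(party_id, Party(party_id, name))
    party.roles.add(role)
    return party


# Hypothetical data: the same company appears in two roles.
registry: dict[str, Party] = {}
add_role(registry, "P-001", "Acme Industrial", "supplier")
add_role(registry, "P-001", "Acme Industrial", "customer")
```

After both calls the registry holds a single `Party` whose `roles` set contains both roles, rather than one supplier record and one customer record.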

Master data quality issues can cause customer dissatisfaction, operational inefficiencies, and poor decision making. Furthermore, companies handling private or sensitive consumer information have stricter compliance requirements and data quality, security, and privacy standards. Without good data, implementing MDM processes will be difficult.
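Two of the quality dimensions discussed above, completeness and uniqueness, lend themselves to simple automated checks. The following is a minimal sketch under stated assumptions: the field names, the sample records, and the choice of name plus email as the duplicate key are all hypothetical, and a production profile would cover more dimensions.

```python
from collections import Counter


def profile(records, required_fields, key_fields):
    """Return the share of complete records and the number of duplicate records."""
    total = len(records)
    # A record is incomplete if any required field is missing or blank.
    incomplete = sum(
        1 for r in records
        if any(not str(r.get(f) or "").strip() for f in required_fields)
    )
    # Records sharing the same normalized key fields count as duplicates.
    keys = Counter(
        tuple(str(r.get(f) or "").strip().lower() for f in key_fields)
        for r in records
    )
    duplicates = sum(n - 1 for n in keys.values() if n > 1)
    return {
        "completeness": (total - incomplete) / total if total else 1.0,
        "duplicate_records": duplicates,
    }


# Hypothetical client records.
clients = [
    {"name": "Acme Industrial", "email": "ap@acme.example", "country": "US"},
    {"name": "acme industrial ", "email": "ap@acme.example", "country": ""},
    {"name": "Globex", "email": "", "country": "DE"},
    {"name": "Initech", "email": "info@initech.example", "country": "US"},
]
report = profile(clients, ["name", "email", "country"], ["name", "email"])
```

Running such a check on a schedule is one way to reduce the manual review effort that survey respondents described.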

Complex data integration requirements

Organizations may find it difficult to integrate MDM into their existing systems. Compatibility issues, data migration challenges, and system upgrades can hinder successful MDM implementation, and minimizing integration latency is crucial to provide timely and accurate data to the MDM system. Organizations may have to significantly model, map, and transform data systems so they can work with newer and older technologies.

How to effectively implement and optimize MDM capabilities

To overcome these challenges and successfully implement and optimize MDM capabilities, organizations must clearly identify the value they hope to create based on their priority business use cases such as operational efficiency and customer insights, which lead to cost savings and revenue growth. Organizations should measure the impact and effectiveness of MDM implementation using metrics such as ROI, total cost of ownership, and performance baselines. Organizations should maintain a forward-looking approach to adopt modern tools and technologies; create a robust data governance model backed by performance KPIs; and plan for capability building among stakeholders to ensure a uniform adoption of MDM principles.
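For readers who want to see the arithmetic behind ROI and total cost of ownership, here is a small sketch. It uses the common definition of ROI as net benefit divided by cost; the cost and benefit figures are invented purely for illustration and do not come from the survey.

```python
def total_cost_of_ownership(one_time_costs, annual_costs, years):
    """One-time costs (licenses, implementation) plus recurring costs over the period."""
    return one_time_costs + annual_costs * years


def roi(total_benefit, total_cost):
    """Return on investment as a fraction: net benefit divided by cost."""
    return (total_benefit - total_cost) / total_cost


# Hypothetical three-year figures.
tco = total_cost_of_ownership(one_time_costs=400_000, annual_costs=100_000, years=3)
benefit = 350_000 * 3  # assumed annual savings from fewer data errors
mdm_roi = roi(benefit, tco)
```

With these assumed numbers, the three-year cost is 700,000, the benefit is 1,050,000, and the ROI is 0.5, or 50 percent.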

Build a ‘golden record’ that contains the most up-to-date information.

An MDM “golden record” is a repository that holds the most accurate information available in the organization’s data ecosystem. For example, a golden record of client data is a single, trusted source of truth that can be used by marketing and sales representatives to analyze customer preferences, trends, and behaviors; improve customer segmentation; offer personalized products and services; and increase cross-sales, interactions, customer experiences, and retention.

To build a golden record that contains the most up-to-date information, organizations integrate data from every business unit into the golden record and update it as more accurate information becomes available. Integrating information can be done with the help of AI and machine learning (ML) technology. Alternatively, organizations may establish one existing system as the golden record for a specific data domain to maintain consistency, precision, and timeliness across the enterprise.
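The merge step can be sketched in a few lines. The example below assumes one possible survivorship rule, "the most recently updated non-empty value wins for each field". That rule, the source system names, and the record layout are assumptions for illustration; organizations define their own survivorship strategies.

```python
from datetime import date


def build_golden_record(source_records):
    """Merge field by field, keeping the most recently updated non-empty value."""
    golden = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for record in sorted(source_records, key=lambda r: r["updated"]):
        for field_name, value in record["data"].items():
            if value not in (None, ""):
                golden[field_name] = value
    return golden


# Hypothetical records for the same client from two source systems.
sources = [
    {"system": "crm", "updated": date(2024, 1, 10),
     "data": {"name": "Acme Industrial", "phone": "555-0100", "email": ""}},
    {"system": "billing", "updated": date(2024, 3, 2),
     "data": {"name": "Acme Industrial Inc.", "phone": "", "email": "ap@acme.example"}},
]
golden = build_golden_record(sources)
```

Here the newer billing record supplies the name and email, while the phone number survives from the older CRM record because the billing system has no value for it.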

Four common master data management design approaches

Organizations typically use one of four master data management design approaches, depending on the complexity of their data:

Registry MDM. This model aggregates data from multiple sources to spot duplicates in information. It is a simple, inexpensive approach that large, global organizations with many data sources often find helpful.

Consolidation MDM. This approach periodically sorts and matches information from multiple source systems to create or update the master data record. Simple and inexpensive to set up, it is a good option for organizations seeking to analyze large sets of data.

Centralized MDM. This approach establishes a single master repository to create, update, and maintain data, and shares it back with the respective source systems. This model is good for banks, insurance companies, government agencies, and hospital networks that require strict compliance to maintain integrity and control over their data.

Coexistence MDM. This approach creates and updates data in source systems, giving businesses the flexibility and autonomy to manage data attributes at the division or business-unit level while maintaining consistent core client data. This model is especially good for large, complex enterprises with many segments and business-unit structures that are frequently integrating new clients into their databases.

Organizations typically start by deploying more rudimentary MDM models, such as registry or consolidation, then evolve to more mature approaches, such as centralized or coexistence. These more mature models are more flexible but also more complex. When choosing an MDM deployment approach, organizations should consider the following questions, among others:

  • How should the organization centralize and streamline master data across different systems and locations to maximize accessibility and usability?
  • What methodologies should be used to manage the complexity of data relationships and structures to improve efficiency and interoperability across systems?
  • What strategies need to be implemented to enable real-time master data updates and guarantee instant access to the most current and accurate information?
  • How should the organization maintain consistent, high-quality data across all departments to support data-driven decision making?
  • What initiatives need to be implemented to empower business units to increase autonomy and maturity, fostering innovation and agility throughout the organization?
  • Which systems must be seamlessly integrated with the MDM strategy to establish a cohesive and unified data ecosystem?
  • How should MDM support and enhance current and future business processes to drive sustainable growth and competitive advantage?
  • What proactive measures should be in place to address regulatory and compliance requirements, ensuring risk mitigation and adherence to industry best practices?

There are four common MDM design approaches that can be used to update the golden record within the business unit data (see sidebar “Four common master data management design approaches”). Deploying a modular architecture enables fit-for-purpose consumption and integration patterns with various systems to manage the golden record. For example, every mastered client record could be linked back to the source systems and mapped to a hierarchy to show association in the MDM system. Alternatively, client data could be mastered and assigned a unique client ID within the golden record to stitch together data from all systems and create a single portfolio of a client.
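One way to picture the unique client ID approach is a cross-reference table that maps each source system's local identifier to a single master ID. The sketch below is a simplified, in-memory illustration; the class name, the ID format, and the system names are hypothetical, and it assumes that the decision about which records belong to the same client has already been made elsewhere.

```python
from itertools import count


class ClientCrosswalk:
    """Maps (source system, local ID) pairs to one master client ID."""

    def __init__(self):
        self._next = count(1)
        self._by_source = {}  # (system, local_id) -> master_id
        self._by_master = {}  # master_id -> list of (system, local_id)

    def link(self, system, local_id, master_id=None):
        """Link a source record to a master ID, minting a new ID if none is given."""
        key = (system, local_id)
        if key in self._by_source:
            return self._by_source[key]
        if master_id is None:
            master_id = f"C-{next(self._next):06d}"
        self._by_source[key] = master_id
        self._by_master.setdefault(master_id, []).append(key)
        return master_id

    def sources_for(self, master_id):
        """List every source record stitched to this master ID."""
        return list(self._by_master.get(master_id, []))


# Hypothetical usage: one client known to both a CRM and a CDP.
xwalk = ClientCrosswalk()
acme = xwalk.link("crm", "CRM-88")
xwalk.link("cdp", "profile-4512", master_id=acme)
globex = xwalk.link("crm", "CRM-91")
```

Calling `xwalk.sources_for(acme)` then returns both source records, which is the link back to source systems that the paragraph above describes.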

Establish a robust data governance model to maintain integrity and reliability of MDM capabilities

Only 29 percent of companies responding to our survey had full upstream and downstream MDM integrations with source systems and business applications, as well as all governance or stewardship roles, in place. Organizations should clearly identify the single source of truth for data and properly train employees on handling integration failures to avoid saving stale information.

Data governance models for MDM should be designed with clear roles and responsibilities, be managed by a governance council with representatives from different business units and IT, and be shepherded by someone who can serve as an MDM liaison among business, data, and technology stakeholders. The structure should be complemented by a clearly defined policy framework and a tailored, business-backed, and IT-supported operating model for master data domains. These data governance processes will allow upstream system owners and a data governance council to address data quality issues—for example, when the MDM identifies new or updated information as conflicting with other information based on the survivorship strategy.

Choose an MDM tool that enhances data quality and accelerates transformation

MDM tools are becoming more intuitive and user-friendly, and recent innovations in AI, ML, cloud technologies, and federated architectures have opened new possibilities for data mastering and processing. For example, AI-enabled tools use pretrained AI and ML models to automate data quality, data matching, and entity resolution tasks with a higher degree of accuracy and greater efficiency. According to the survey, 69 percent of organizations are already using AI as part of their overall data management capabilities; however, only 31 percent are using advanced AI-based techniques to enhance match-and-merge capabilities and to improve master data quality more broadly.
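To make the matching step concrete, the sketch below flags likely duplicate client names using plain string similarity from Python's standard library. This is a deliberately simple stand-in and not the pretrained AI or ML matching that the tools described above provide; the normalization rules, the 0.85 threshold, and the sample names are assumptions.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def similarity(a, b):
    """Similarity ratio between 0 and 1 for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_candidates(names, threshold=0.85):
    """Return pairs of names whose similarity meets the threshold."""
    pairs = []
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            score = similarity(left, right)
            if score >= threshold:
                pairs.append((left, right, round(score, 2)))
    return pairs


# Hypothetical client names from different source systems.
names = ["Acme Industrial Inc.", "ACME Industrial, Inc", "Globex GmbH"]
pairs = match_candidates(names)
```

The two Acme spellings normalize to the same string and are returned as a match candidate, while Globex is not paired with either. In practice, borderline scores would usually go to a data steward for review rather than being merged automatically.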

Organizations should choose data management tools that align with their priorities and make the transition seamless. It’s also important to consider the return on investment and the incremental value that each MDM tool can bring to the organization. When choosing an MDM tool, relevant business stakeholders should understand data processes and requirements, including the data elements that affect business operations and the priority use cases, and then help determine the technology capabilities and workflows that are required to integrate new systems.

For example, stakeholders should assess the maturity of their organization’s capabilities, including its data quality, matching, and entity resolution, to determine how easily new systems will be able to integrate with existing systems and technologies. It is also important to consider these systems’ scalability and flexibility to accommodate future growth and evolving data management needs. Moreover, AI and ML capabilities should be considered to help the MDM tool automate tasks to improve data quality.

Plan for capability building and change management

Organizations that implement technology without changing their processes and the way people work with master data may not fully reap the benefits of MDM.

Change management is crucial to ensure that employees understand and embrace the changes brought about by MDM implementation. It typically includes securing executive sponsorship to demonstrate the importance of MDM to the organization; engaging with business and technology stakeholders to communicate the vision; setting expectations for accountability and processes; and rolling out comprehensive training programs to educate employees on MDM and data principles, processes, and tools.

Start with a pilot implementation

Organizations can start integrating MDM tools by first piloting MDM in one domain to validate its design, governance model, and workflows in a controlled environment. Organizations can then easily identify any potential issues or challenges and make the necessary adjustments before scaling up the implementation to other master data domains or to the entire organization. Piloting these tools also allows organizations to gather feedback from users and stakeholders to understand the user experience, identify areas for improvement, and make necessary changes to optimize the MDM tool and workflows.

Implementing and optimizing MDM capabilities can seem daunting, especially for large organizations with multiple complex systems. But once successfully deployed across master data domains—using an optimal design approach, an efficient governance structure, and sufficient change management efforts—MDM can ensure that high-quality data is available for strategic decision making, leading to cost savings and revenue opportunities across an organization.

Aziz Shaikh and Jorge Machado are partners in McKinsey’s New York office, where Kayvaun Rowshankish is a senior partner, Rachit Saxena is a consultant, and Rajat Jain is an associate partner. Holger Harreis is a senior partner in the Düsseldorf office.

The authors wish to thank Vladimir Alekseev for his contributions to this article.

COMMENTS

  1. Data Management Plans

    A data management plan, or DMP, is a formal document that outlines how data will be handled during and after a research project. Many funding agencies, especially government sources, require a DMP as part of their application processes. Even if you are not seeking funding for your research, documenting a plan for your research data is a best ...

  2. PDF Complete Guide to Writing Data Management Plans

    For this reason, your plan should describe data management and sharing during your research and, importantly, after your research is complete. Many funding agencies and sponsors require a data management plan with each proposal, but any researcher or team will benefit from developing a data management plan at the beginning of a project.

  3. What Is Data management Plan (DMP)?

    A data management plan (DMP) is a document which defines how data handled throughout the lifecycle of a project—that is, from its acquisition to archival. While these documents are typically used for research projects to meet funder requirements, they can be leveraged within a corporate environment as well to create structure and alignment ...

  4. Write a data management plan

    A data management plan (DMP) will help you manage your data, meet funder requirements, and help others use your data if shared. The DMPTool is a web-based tool that helps you construct data management plans using templates that address specific funder requirements. From within this tool, you can save your plans, access MIT-specific information ...

  5. Data Management Plans

    A Data Management Plan outlines how data will be collected, organized, stored, secured, shared, and preserved in a research project. It covers data collection methods, organization, storage, sharing, preservation, ethics, and researcher responsibilities. Data Management Plans promote transparency and maximize research impact by ensuring your ...

  6. Research Data Management: Plan for Data

    Data management plans (DMPs) are documents that outline how data will be collected, stored, secured, analyzed, disseminated, and preserved over the lifecycle of a research project. They are typically created in the early stages of a project, and they are typically short documents that may evolve over time.

  7. Writing data management plans

    A data management plan (DMP) or data management and sharing plan (DMSP) is a written document that describes the data you expect to acquire or generate during the course of a research project; how you will manage, describe, analyze, and store those data; and what mechanisms you will use at the end of your project to share and preserve your data.

  8. What is Research Data Management

    Research Data Management is the process of providing the appropriate labeling, storage, and access for data at all stages of a research project ... To address these challenges, funding agencies require a data management or data sharing plan to be submitted with grant applications.

  9. Data Management Plan

    Crafting your data management plan. Most research funders encourage researchers to think about their research data management activities from the beginning of the project. This will often mean a formal plan for managing data (a 'data management plan'). However, even informally setting out your plans and project guidelines can make your life much easier.

  10. Data management made simple

    A data-management plan explains how researchers will handle their data during and after a project, and encompasses creating, sharing and preserving research data of any type, including text ...

  11. What is a research data management plan?

    A research data management plan is a document that describes: Who will be responsible for each of these activities, including contingencies for the data owner or principal investigator leaving the University. The best time to develop your data management plan is at the beginning of your research. Time invested in creating a robust data ...

  12. Research Data Management: Data Management Plan

    A data management plan (DMP) is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyze, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data. You may have already considered some or ...

  13. PDF Essentials of data management: an overview

    Outlining a data management strategy prior to initiation of a research study plays an essential role in ensuring that both scientific integrity (i.e., data generated can accurately test the fi ...

  14. Ten Simple Rules for Creating a Good Data Management Plan

    Research papers and data products are key outcomes of the science enterprise. Governmental, nongovernmental, and private foundation sponsors of research are increasingly recognizing the value of research data. As a result, most funders now require that sufficiently detailed data management plans be submitted as part of a research proposal.

  15. Data Management Plans

    Data Management Plans. A data management plan (DMP) is a formal document that outlines how a researcher intends to manage their research data during and after a project. Creating a DMP can help you: Make decisions about managing your data and understand the implications of those decisions. Identify resources and tools needed for your research.

  16. 1.1 Data Management Plans (DMP)

    A Data Management Plan (DMP) is a written living document that formally outlines what you will do with your research data during the course of your research project and afterwards. It is a living document because any time your research plans change, you should review your DMP in order to make sure that the plan still satisfies your essential ...

  17. Developing a Data Management Plan

    A good data management plan begins by understanding the sponsor requirements funding your research. As a principal investigator (PI) it is your responsibility to be knowledgeable of sponsors requirements. The Data Management Plan Tool (DMPTool) has been designed to help PIs adhere to sponsor requirements efficiently and effectively.

  18. Writing a data management plan

    A data management plan (DMP) is a formal document that outlines how you will handle your data both during your research and after the project is completed. This ensures that data are well-managed in the present and prepared for preservation in the future. A DMP is often required in grant proposals. A research data management plan is a living document and should be reviewed and updated regularly.

  19. Data Management Planning

    A Data Management Plan (DMP) is a formal document outlining how you will handle data during your research project and after the project is complete. Data management addresses the entire lifecycle of your data--from creation to organization, access, storage, preservation, and distribution. Managing your research data ensures that it will be ...

  20. Writing a Data Management & Sharing Plan

    Data sharing plans should describe how an applicant will share their final research data. The specifics of the plan will vary on a case-by-case basis, depending on the type of data to be shared and how the investigator plans to share the data. Examples of information to cover in a data sharing plan include: The expected schedule for data sharing

  21. Managing data across the research lifecycle

    A Data Management Plan (DMP) describes your planned and/or actioned data management and sharing activities. It is generally 1-3 pages in length and should cover the four phases of the research data lifecycle. It is generally written at the start of a research project and should be revisited at different stages of the project and updated where ...

  22. Data management plans

    A data management plan is a document which outlines how data will be managed throughout the whole project life cycle. Generally a plan covers initial decisions, how data will be handled during the active phase of research, and longer-term questions of preservation and sharing. The plan may be updated and revised as a project develops.

  23. Data Management and Sharing Plans

    A data management plan (DMP) outlines the details of how the data for a research project is handled, during and after the research project. They are required by many funding agencies as part of their application and are considered a best practice for research.

  24. DMPTool

    DMPTool. The DMPTool is web-based and provides basic templates to help you construct a Data Management Plan. Using DMPTool, researchers can access a template, example answers, and guiding resources to successfully write a data management plan for any research project or grant.

  25. Elevating master data management in an organization

    1. MDM plays an important role with modern data architecture concepts and creates value in five ways: MDM cleans, enriches, and standardizes data for key functions, such as customer or product data, before it is loaded into the data lake. In this way, MDM ensures that data is accurate, complete, and consistent across an organization.

  26. Plan your research data management

    Preparing a research data management plan will help you make good data management choices.

  27. PDF HRP Data Management Plan

    The purpose of Human Research Program Data Management Plan (DMP) is to define the processes and activities required for the overall management of the research data collected and managed by HRP throughout their life cycle. New updates to the Data Management Plan in 2023 include. 1.

  28. Data management plan

    A data management plan or DMP is a formal document that outlines how data are to be handled both during a research project, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this may lead to data being well-managed in the present, [citation ...
