Monash Health Library

Research Data Management Guide

Research Data Management (RDM) is the practice of managing, organising and preserving all of the information used to produce research, from the initial planning and searching through to post-publication. This data comprises a range of records such as notes, spreadsheets, surveys, emails, published material and grey literature.

Preservation

Data preservation - a series of managed activities that ensure continued access for as long as necessary - is a top priority when planning new research. It is important to preserve your research data from the commencement of your project so that long-term preservation is possible once the research has been completed.

Funders, institutions and publishers will have strict requirements specifying how data should be preserved long-term. It is important to check these requirements on acceptance of funding.

Values underpinning the preservation of data are:

  • Unique data that cannot be replaced or replicated should be preserved
  • Data should be verified by the researchers as authoritative and correct to support sound research
  • Legal requirements are complied with, eg. copyright

Short-term preservation of data during research will ensure that the data is safe, accessible and protected against any loss. 

To prepare for this, your Data Management Plan (also covered in this toolkit) should include details of, and instructions for:

  1. Data Backups - when they will be scheduled, who will be responsible for them and how they will be stored
  2. Data Security - setting of appropriate access controls to your data especially when multiple researchers are involved
  3. Data Findability - planning for your data to be found and understood, eg. a plan for unique document identifiers, the metadata you will record, whether the collection will be indexed and/or searchable.
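As an illustration of the findability item above, the following is a minimal sketch of recording a unique identifier and basic metadata next to a data file. The function name `write_metadata_record`, the field names and the `.metadata.json` suffix are assumptions made for this example, not a standard; your repository or institution may require a specific metadata schema.

```python
import json
import uuid
from datetime import date
from pathlib import Path


def write_metadata_record(data_file: Path, title: str, creator: str, keywords: list[str]) -> Path:
    """Write a JSON metadata 'sidecar' next to data_file and return its path."""
    record = {
        # Locally unique identifier (hypothetical choice); this is not a DOI.
        "identifier": str(uuid.uuid4()),
        "title": title,
        "creator": creator,
        "date_created": date.today().isoformat(),
        "keywords": keywords,
        "file_name": data_file.name,
    }
    # e.g. survey.csv -> survey.csv.metadata.json (illustrative naming only)
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Keeping the record in a separate plain-text file means it can be indexed or searched without opening the data file itself.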

Long-term preservation involves the data being submitted to an appropriate repository for storage after the research is complete, ensuring its security, accessibility and findability. See 'Deposit of Research Data' for recommended repositories.

If some data is to be made publicly available, it is important that it is findable and reusable. Describe and cite the data appropriately, and assign DOIs (Digital Object Identifiers) to datasets.

Deposit of Research Data

When a research project has been completed, an appropriate archive or repository needs to be selected for storage of the research data.

Prior to deposit, data should be prepared. This may involve cleaning the data and ensuring that everything, from the data itself (survey data, for example) through to documentation and citation information, is in an appropriate format. Some repositories will provide assistance with preparing the data for deposit and other data management tasks.

The choice of repository will depend on the data type and research discipline. The following finding tools will assist with this decision.


Repository Finding Tools

  • Databib.org - a free annotated bibliography and catalogue of research data repositories
  • FAIRsharing.org - identify repositories that are available for specific data or discipline
  • GHDx - Global Health Data Exchange is a data catalogue of surveys, censuses, statistics and other public health and global health data
  • re3data.org - Registry of Research Data Repositories is an open science tool that offers an overview of international repositories
  • Repository Finder - a tool hosted by DataCite that queries the re3data registry for repositories relevant to the FAIRsFAIR Project

Repositories

The following list provides a selection of significant research data repositories.

  • The Australian Data Archive (ADA) : a national service for the collection and preservation of digital research data.
  • Clinical Study Data Request : facilitates the sharing of patient level data from clinical study sponsors and funders.
  • DANS : Data Archiving and Networked Services to deposit research data, search for datasets and research projects and provide education on RDM.
  • Dryad : nonprofit membership organisation that assesses files for quality control and ensures best practice is followed.
  • Figshare : a flexible open access repository where any file format may be uploaded and shared privately or made public.
  • Harvard Dataverse : repository which includes medicine, health and life sciences.
  • Health and Medical Care Archive (HMCA) : preserves and disseminates data collected by health and healthcare research projects funded by the Robert Wood Johnson Foundation (RWJF).
  • Oracle Healthcare Data Repository : repository that supports the exchange of healthcare related information.
  • PhysioNet : established under the National Institutes of Health (NIH), PhysioNet offers free access to large collections of physiological and clinical data and related open-source software.
  • Sicas Medical Image Repository : Swiss-based SICAS acquires and stores medical images and processes data for research and applications in medicine.
  • WHO Global Health Observatory Data Repository : provides access to over 1,000 indicators on priority health topics.

Formats for Data Preservation

File formats need to be considered and decided upon before data collection commences. While they are usually dictated by the software you use, it is often possible to opt for more than one format, for example .csv or .xls for a spreadsheet.

When settling on file formats for your data, it is important to bear in mind:

  • that file formats can become obsolete - where possible retain multiple formats to reduce risk of loss;
  • the longevity of compatible software and hardware.

Examples of recommended file formats with universal application are:

Data types : Formats

  • Tabular data : Comma-separated values (.csv); Tab-delimited (.tab); SPSS portable format (.por)
  • Textual data : Rich Text Format (.rtf); Plain text, ASCII (.txt); eXtensible Mark-up Language (.xml)
  • Image data : TIFF (.tif)
  • Video data : MPEG-4 (.mp4)
  • Documentation and scripts : Rich Text Format (.rtf); PDF/UA, PDF/A or PDF (.pdf)
 


Authoritative guidance:

The following organisations provide guidance on preservation formats and best practice for data storage.

Retention guidance

  1. Start with a digital preservation strategy in your Data Management Plan (DMP)
  2. Think long term from the beginning
  3. Decide on where your data will be stored in both the short and long term - cloud? repository? secure network?
  4. Have a system for organising and saving data that includes unique identifiers and metadata
  5. Choose a suitable storage system that includes backups
  6. Ensure you use durable file formats, software and hardware

Documentation

When retaining research data, it is important to document your decisions to assist with re-use. Documentation should cover how you captured data during your research, metadata applied, the software used for storage and analysis and the file formats that were selected.

In addition, it is wise to store a copy of any specific software used to help cover future software changes.

Good retention practice

The following practices will help ensure your data is preserved and available to future researchers.

  • Copy data files to new media every 2 to 5 years
  • Check the data integrity of stored files at regular defined times
  • Have a storage strategy that includes two different forms of storage to safeguard against loss
  • Digitise any paper documents
  • Ensure the storage environment is suitable and fit for the purpose
  • Have appropriate security in place for physical, cloud-based and repository systems and audit user access periodically

The UK Digital Curation Centre also provides guidance on retention, what to keep and what to delete.
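The integrity check mentioned in the list above is commonly done by recording a checksum for each file and comparing it again at each scheduled check. The following is a minimal sketch using SHA-256 from the Python standard library; the function names `sha256_of`, `build_manifest` and `changed_files` are hypothetical, and where the manifest is stored and how often it is checked are left to your Data Management Plan.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file under data_dir (by relative path) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return files from manifest that are now missing or have a different checksum."""
    current = build_manifest(data_dir)
    return sorted(name for name, checksum in manifest.items() if current.get(name) != checksum)
```

A typical use would be to call `build_manifest` when the data is first stored, keep the result somewhere separate from the data, and run `changed_files` at each scheduled check and after copying files to new media.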
