Research Data Management Toolkit: Retention and Preservation

This guide provides information about research data management and the Tropical Data Hub (TDH) Research Data repository.

Introduction: Data Retention And Preservation

Preserving data after your research project is critical to:

  • prevent data loss
  • enable long-term access, discovery and re-use
  • ensure researchers and institutions can defend their research outcomes if they are challenged

Preservation activities need to be planned and should take into account file formats, data quality, data ownership, retention periods, preferred data repositories and ways to share data safely.

Deciding what research data to retain and/or share can be difficult. Consider:

  • what is needed for validation and re-use
  • how long it would take to collect the data again, or whether that would even be possible (see the data type examples below)
  • the significance and value of your data. Remember, though, that it is almost impossible to anticipate how useful your data may be to other researchers, or even to your future self!
  • retention periods required under the JCU Code for the Responsible Conduct of Research

Types of Data

As noted in the introduction to the Toolkit, there are many definitions and types of data. This section looks at some data types and how difficult each would be to replace, which is something to consider when deciding what to archive:

Observational data

presence/absence, sensor readings (usually irreplaceable)

Physical data

rock samples, blood samples, plants, interview transcripts, diaries (usually irreplaceable)

JCU staff and students should contact their College, Centre or research unit for advice and to locate JCU repositories (such as the tissue bank) for their physical data. Plan to digitise physical data where you can (e.g. microscopy slides, transcripts).

Experimental data

gene sequences, chromatograms (reproducible, but expensive)

Simulation data

climate models (the model and its inputs are the most important things to retain)

Derived/compiled data

compiled databases (reproducible, but expensive)

In many cases, derived data are straightforward to reproduce from your (irreplaceable) raw data, as long as a detailed workflow/methodology is also made available. Including the derived data as well is still advisable: they can be computationally intensive to reproduce, other researchers may not wish to do so, and derived data may be easier for researchers from other disciplines or the public to understand.

Adapted from: Presentation by Marianne Brown, eResearch Centre, James Cook University - licensed under a Creative Commons 3.0 Australia Licence

Retention Periods

Retention rules are defined by the research funding body or the university. Key documents for JCU researchers include section 2.5 of the JCU Code for the Responsible Conduct of Research and the University Sector Retention and Disposal Schedule for Queensland universities.

In general, the minimum retention period is 5 years from the end of the year of publication of the last refereed publication, or other form of public release to an audience outside the University, that is based on the data.

However, the retention period in any particular case should be determined by the specific type of research. For example, in areas such as gene therapy, research data must be retained permanently.

Rules in respect of specific types of data include:

Research data - clinical trials
Research data created in the conduct of clinical trials.
Retain for 15 years after completion of clinical research/trial AND 10 years after last patient service provision or medico-legal action.

Research data - other (does not result in patent)
Research data created in the conduct of research which does not fit into the other categories, which does not result in a patent.
Retain for 5 years after last action, e.g. end of the year of publication of the last refereed publication.

Research data - other (results in patent)
Research data created in the conduct of research which does not fit into the other categories, which results in a patent.
Retain for 7 years after expiry of patent (i.e. a minimum of 27 years).

Research data - significant
Research data created in the conduct of a research project, including clinical trials, which is of high public interest or significance to the discipline such that it has or will change a commonly held view or approach irrespective of the field in which the research is conducted. 
Factors which may determine significance include projects which:

  • are controversial
  • are the subject of extensive debate
  • arouse widespread scientific or other interest
  • have the potential to cause major adverse impacts on the environment, society or human health 
  • involve eminent researchers
  • involve the use of major new or innovative techniques.

Retain permanently
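
The rules above can be expressed as a small date calculation. The following Python sketch is illustrative only: the category keys and the function names (add_years, end_of_publication_year, retain_until) are hypothetical and not part of any JCU system. It also assumes that the "AND" in the clinical trials rule means both periods must have elapsed, so the later of the two dates applies. Always confirm the actual requirement against the Code and the Schedule.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date forward by whole years (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def end_of_publication_year(year: int) -> date:
    """'Last action' for published research: 31 December of the publication year."""
    return date(year, 12, 31)


def retain_until(category: str, last_action: date,
                 last_patient_action: Optional[date] = None) -> Optional[date]:
    """Return the minimum date to which data must be kept, or None for 'retain permanently'.

    Category keys are hypothetical labels for the schedule entries above.
    """
    if category == "significant":
        return None  # retain permanently
    if category == "other_no_patent":
        return add_years(last_action, 5)   # 5 years after last action
    if category == "other_patent":
        return add_years(last_action, 7)   # last_action is the patent expiry date
    if category == "clinical_trial":
        if last_patient_action is None:
            raise ValueError("clinical trials need the last patient service or medico-legal date")
        # Assumption: both periods must have elapsed, so take the later date.
        return max(add_years(last_action, 15), add_years(last_patient_action, 10))
    raise ValueError(f"unknown category: {category}")


# Example: last refereed publication in 2020, no patent.
print(retain_until("other_no_patent", end_of_publication_year(2020)))
```

These are minimum periods only; a funding body or the nature of the research may require a longer period.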

JCU Higher Degree by Research (HDR) students and researchers should archive their completed data in the Tropical Data Hub (TDH) Research Data repository. This includes:

  • completed HDR data
  • data associated with publications or funded projects
  • significant data or data with potential re-use value

With rare exceptions, datasets deposited in the Tropical Data Hub (TDH) Research Data repository will be assigned a DOI. Your data is a citable part of the "scholarly record" and will be retained permanently regardless of the recommended retention period.

 

We acknowledge the Australian Aboriginal and Torres Strait Islander peoples as the first inhabitants of the nation and acknowledge Traditional Owners of the lands where our staff and students live, learn and work.