
Research Data Management Toolkit: Share Data Safely

This guide provides information about research data management and the Tropical Data Hub (TDH) Research Data repository

Sharing Sensitive Data

What is sensitive data?

The Australian National Data Service (2017) includes a definition of sensitive data in their guide:

Sensitive data identifies individuals, species, objects or locations, and carries a risk of causing discrimination, harm or unwanted attention.

Under law and the research ethics governance of most institutions, sensitive data cannot typically be shared in this form. The Legal and Ethical Framework page of this Toolkit outlines the applicable legislation and guidelines.

Sensitive data is often about people (personal information), but ecological data can also be sensitive if it reveals, for example, the location of rare or endangered species.

Personal information is sensitive if it directly identifies a person and includes one or more pieces of information from Table 1 (Part I, Division I, Section 6) of the Privacy Act 1988. This includes: racial or ethnic origin | political opinions | membership of a political association | religious beliefs or affiliations | philosophical beliefs | membership of a professional or trade association | membership of a trade union | sexual orientation or practices | criminal record | health information (see section 6FA for definition) | genetic information | biometric information.

Controlled Access Options


The Tropical Data Hub (TDH) Research Data repository can provide different levels of access to your data:

 

1. Metadata Only

 

Most data can be shared (conditional access) or published (open access), but this option is useful for sensitive datasets that cannot be de-identified and for highly confidential data. Making metadata available ensures your work is more visible and facilitates discussion and collaboration with other researchers. Data should be stored in the secure, private data section of the Tropical Data Hub archive to meet data management requirements.


2. Metadata + Mediated Access

 

This can be a good option for sharing data that has been de-identified. By making access conditional (other researchers will need to contact the data manager or nominated primary contact) you can ensure requestors are genuine researchers and that they will maintain confidentiality and keep data files secure. You can maintain oversight over who is using the data and for what purpose, and decide if you wish to be a collaborator. 


3. Metadata + Open Access

 

Data can be downloaded via a link in the Tropical Data Hub (TDH) Research Data repository. SPARC defines open data as research data that can be freely used, reused and redistributed by anyone, subject at most to the requirement to attribute and share alike. The ideal route is to ensure data is in a machine-readable format on an easily accessible platform with an open licence applied to it. Some licences are more "open" than others. Look at the Know Your Rights: Understanding CC Licences Poster for a good visual comparison of the licences.

This option maximizes the visibility and potential impact of your data and may be required by your funder or publisher.


More options

It's worth noting that you:
  • can make different data files (in the same dataset) available under different conditions. Using survey data as a fictional example: raw data with direct identifiers would need to be stored in the secure section of the Tropical Data Hub (option 1), de-identified data might be made available via negotiation (2) and the survey questions and codebook describing data variables could be public (3)
  • can change options even after the dataset has been published. Under certain circumstances you may wish to have restricted or conditional access to your data and then open it up after a nominated period. 
  • should choose a licence for your data (see the Copyright and Licensing section of the Toolkit) if your data is open or available via mediated access, as a licence will govern use of the data once you grant access to it.

Please contact us to discuss if you have special requirements for controlling access or licensing.
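The per-file arrangement described in the survey example above can be pictured as a simple mapping from files to access levels. The following is only an illustrative sketch: the file names, the `Access` enum and the helper functions are hypothetical and do not represent how the Tropical Data Hub repository is actually configured.

```python
from enum import Enum


class Access(Enum):
    """The three access options described in this guide."""
    METADATA_ONLY = 1   # data held in secure storage, only metadata visible
    MEDIATED = 2        # access negotiated with the data manager
    OPEN = 3            # data downloadable via a link


# Hypothetical files in one fictional survey dataset
dataset_files = {
    "survey_raw_with_identifiers.csv": Access.METADATA_ONLY,
    "survey_deidentified.csv": Access.MEDIATED,
    "survey_questions.pdf": Access.OPEN,
    "codebook.csv": Access.OPEN,
}


def openly_downloadable(files):
    """Return the names of files that anyone could download."""
    return sorted(name for name, level in files.items() if level is Access.OPEN)


def needs_licence(level):
    """A licence is relevant whenever access to the data may be granted."""
    return level in (Access.MEDIATED, Access.OPEN)


print(openly_downloadable(dataset_files))
```

Keeping the access level as an attribute of each file, rather than of the whole dataset, also makes it straightforward to change one file's level later, for example after an embargo period.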

De-identifying Data

Data that has been de-identified no longer triggers the Privacy Act.

Here's an example of sensitive data that has been published as open data. The risk of re-identification via triangulation has been considered and managed, and the de-identified dataset can be downloaded from Research Data Australia.

The PALS (Pregnancy and Lifestyle Study) contains highly sensitive data. Several techniques have been used to de-identify the dataset: identifiers and dates of birth have been removed, ages have been aggregated into bands and postcodes have been excluded. Had postcodes been retained, it would be possible to re-identify (triangulate) participants by combining, for example, a rural postcode with a rare occupation.
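The three techniques mentioned for PALS (removing direct identifiers, aggregating ages into bands and excluding postcodes) can be sketched in a few lines of Python. This is a minimal illustration, not the process actually used for the PALS dataset: the field names, the sample record and the ten-year band width are all assumptions, and applying these steps does not by itself guarantee that a dataset is de-identified.

```python
# Hypothetical field names; adjust to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "date_of_birth"}
EXCLUDED_FIELDS = {"postcode"}  # dropped to reduce triangulation risk


def age_band(age, width=10):
    """Aggregate an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record):
    """Return a copy of the record with identifiers removed and age banded."""
    dropped = DIRECT_IDENTIFIERS | EXCLUDED_FIELDS | {"age"}
    cleaned = {key: value for key, value in record.items() if key not in dropped}
    cleaned["age_band"] = age_band(record["age"])
    return cleaned


# A fictional record, invented for illustration only
record = {
    "name": "Jane Citizen",
    "date_of_birth": "1985-03-14",
    "age": 34,
    "postcode": "4871",
    "occupation": "teacher",
    "smoker": False,
}

print(deidentify(record))
# {'occupation': 'teacher', 'smoker': False, 'age_band': '30-39'}
```

Note that the occupation field is retained here; whether that is safe depends on how rare the values are within the dataset.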

Think about de-identifying your data early as it can be time consuming and difficult later. Consult the relevant ANDS guides and seek discipline-specific advice as required.

Key Resources from the Australian National Data Service (ANDS)

This section of the Toolkit draws heavily on the ANDS guide:

Australian National Data Service. (2017). Publishing and sharing sensitive data. Retrieved from: http://www.ands.org.au/__data/assets/pdf_file/0010/489187/Sensitive-Data-Guide-2018.pdf

ANDS has an extensive range of resources that relate to sensitive data - click on the links below to access guides, videos, posters and other materials on each of these topics:

How to Share Sensitive Data

While sensitive data cannot be published in its original form in almost all cases, it can often be shared using a combination of:

  1. informed consent 
  2. data de-identification
  3. controlled access

[Image: key opening a puzzle]

Image courtesy of Stuart Miles at FreeDigitalPhotos.net

Consent

Consent is required from human participants before data can be collected or published.

Obtaining informed consent to facilitate data publication and sharing involves:

  • including information about maintaining confidentiality, data publication and sharing in the information sheet (to be approved by the HREC) so participants can make an informed decision before consenting to participate
  • stating the possibility of future data publication and sharing, conditions for access and de-identification processes in consent forms - also to be approved by the HREC

ANDS (2017) provides some example sentences in their guide (pages 14-15) - the examples listed below are appropriate in different contexts e.g. when publishing or when sharing data via mediated access.

"I agree that research data gathered for the study may be published provided my name and other identifying information is not used"  or "other genuine researchers [may] have access to this data only if they agree to preserve the confidentiality of the information as requested in this form"

If consent for sharing is not obtained at the time of the study it may be possible to seek a waiver from reviewers or to go back to participants for additional consent.

The National Statement on Ethical Conduct in Human Research (page 36) raises the ethical issue of obtaining consent for secondary use of data or information. It is (for example) usually impractical to obtain consent for secondary use of data routinely collected during delivery of a service and respect for participants needs to be demonstrated in other ways.

Sharing existing data without explicit consent is a possibility if all of the following conditions are met:

  • it is no longer possible or practical to gain consent
  • the data has been de-identified
  • the process of de-identification matches the definition in the Privacy Act
  • there is no risk that publishing or sharing the data will cause harm or discrimination
  • information sheets and consent forms from the original data collection didn't preclude sharing

JCU HDR students and researchers should always consult their Human Ethics advisor and HREC for specific advice.


De-identification and Controlled Access

De-identification (including triangulation and data linkage) is discussed further on this page.

Note the subtle difference between data sharing and publishing. It may be preferable to share de-identified data via mediated access without publishing it online. Look at the options for controlling access available in the Tropical Data Hub (TDH) Research Data repository for more information.

Triangulation, Data Linkage and Integrating Authorities

It's very important for data owners and managers to be aware that data that is not obviously sensitive (no names or dates of birth for example) or that has been de-identified, can become sensitive through triangulation or data linkage.

Triangulation in this context is the process of combining several pieces of non-sensitive information (in the same dataset) to determine the identity or sensitivity of a participant or subject.
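One rough way to gauge triangulation risk is to count how many records share each combination of seemingly harmless attributes; a combination held by only one record points to a single participant. The sketch below assumes hypothetical field names and invented records, and is a simplified check in the spirit of k-anonymity rather than a complete risk assessment.

```python
from collections import Counter


def smallest_group(records, fields):
    """Size of the smallest group of records sharing the same values for `fields`."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return min(counts.values())


def unique_combinations(records, fields):
    """Combinations of values that occur in exactly one record."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return [combo for combo, n in counts.items() if n == 1]


# Invented records with no names or dates of birth
records = [
    {"postcode": "4895", "occupation": "nurse"},
    {"postcode": "4895", "occupation": "farmer"},
    {"postcode": "4870", "occupation": "farmer"},
    {"postcode": "4870", "occupation": "nurse"},
    {"postcode": "4870", "occupation": "nurse"},
]

# Each attribute on its own is shared by at least two records...
print(smallest_group(records, ["postcode"]))      # 2
print(smallest_group(records, ["occupation"]))    # 2
# ...but combined, some records become unique.
print(smallest_group(records, ["postcode", "occupation"]))  # 1
```

In this example neither postcode nor occupation singles anyone out, yet three of the five postcode and occupation pairs are unique, which is exactly the effect triangulation exploits.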

Data linkage combines two or more datasets that include the same participant or subject, an activity that carries the risk of re-identification and may place subjects at risk. Data linkage is highly useful (it increases understanding without having to collect new data and derives greater value from existing datasets) and is increasingly common in epidemiology and in the medical, social and ecological sciences.

Researchers should treat the new, linked dataset as an identifiable dataset and assess the risks involved.
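To see why a linked dataset should be treated as identifiable, consider a de-identified dataset joined to a second dataset that carries names and shares a few fields with it. The sketch below uses invented records and hypothetical field names; it simply flags records whose shared fields match exactly one named person.

```python
from collections import Counter


def key_of(row, keys):
    """Build the linkage key for a row from the shared fields."""
    return tuple(row[k] for k in keys)


def reidentified(deidentified, named, keys):
    """Records that link to exactly one named person on the shared fields."""
    left_counts = Counter(key_of(r, keys) for r in deidentified)
    named_index = {}
    for row in named:
        named_index.setdefault(key_of(row, keys), []).append(row)

    results = []
    for row in deidentified:
        k = key_of(row, keys)
        matches = named_index.get(k, [])
        # An unambiguous link: one record on each side shares this key
        if left_counts[k] == 1 and len(matches) == 1:
            results.append({**row, "name": matches[0]["name"]})
    return results


# Invented de-identified health records
health = [
    {"postcode": "4895", "birth_year": 1970, "diagnosis": "condition A"},
    {"postcode": "4870", "birth_year": 1985, "diagnosis": "condition B"},
    {"postcode": "4870", "birth_year": 1985, "diagnosis": "condition C"},
]

# Invented second dataset that includes names
register = [
    {"name": "Person One", "postcode": "4895", "birth_year": 1970},
    {"name": "Person Two", "postcode": "4870", "birth_year": 1985},
    {"name": "Person Three", "postcode": "4870", "birth_year": 1985},
]

for row in reidentified(health, register, ["postcode", "birth_year"]):
    print(row["name"], "->", row["diagnosis"])
# Person One -> condition A
```

Neither dataset reveals a named person's diagnosis on its own, but the linked result does for any record with a unique combination of shared fields.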

High-risk data integration projects involving information from Australian, state or territory governments will need to be managed by an accredited integrating authority such as the Australian Institute of Health and Welfare (AIHW), the Australian Institute of Family Studies (AIFS) or the Australian Bureau of Statistics (ABS) to ensure security. Once data is linked, researchers will access it through a secure data lab in Canberra, a mobile data lab, a remote access computing environment or another secure arrangement, and output and use of data will be monitored. The AIHW has useful information on data linkage on their website, including an overview of a typical data integration project.


We acknowledge the Australian Aboriginal and Torres Strait Islander peoples as the first inhabitants of the nation and acknowledge Traditional Owners of the lands where our staff and students live, learn and work.