Broadly, sensitive data is information that could affect the rights of others. The Australian National Data Service (ANDS) includes the following definition of sensitive data in its guide:
Sensitive data identifies individuals, species, objects or locations, and carries a risk of causing discrimination, harm or unwanted attention.
Sensitive data is often about people (i.e., personal information), but ecological data can also be sensitive if it reveals, for example, the location of rare or endangered species. Under law and the research ethics governance of most institutions, sensitive data cannot typically be shared in its original form. The Legal and Ethical Framework section of this Toolkit outlines the applicable legislation and guidelines.
Personal information is sensitive if it directly identifies a person and includes one or more pieces of information from Table 1 (Part I, Division I, Section 6) of the Privacy Act 1988. This information includes:
While sensitive data cannot be published in its original form (in almost all cases), it can often be shared using a combination of:
The ANDS Guide: Publishing and sharing sensitive data defines data publication and sharing: "Publication occurs when data are made public. This includes having a publicly-available description of the data and access, or information about conditional access, to the data itself. Data sharing occurs when data are made available to others, but does not always accompany publication (e.g. when data are shared among colleagues but not publicly discoverable or available)."
Look at the options for controlling access available in Research Data JCU for more information about open and conditional access.
Consent is required from human participants before data can be collected or published. Obtaining informed consent to facilitate data sharing and publication involves:
The Australian National Data Service provides some example sentences in their guide (pp. 14-15) - the examples listed below are appropriate in different contexts (e.g. open and conditional access):
I agree that research data gathered for the study may be published provided my name and other identifying information is not used
other genuine researchers [may] have access to this data only if they agree to preserve the confidentiality of the information as requested in this form
If explicit consent for sharing is not obtained at the time of the study, it may be possible to seek a waiver from reviewers or to go back to participants for additional consent.
The National Statement on Ethical Conduct in Human Research (p. 36) raises the ethical issue of obtaining consent for secondary use of data or information. For example, it is usually impractical to obtain consent for secondary use of data routinely collected during delivery of a service, and respect for participants needs to be demonstrated in other ways.
Sharing existing data without explicit consent is a possibility if all of the following conditions are met:
JCU researchers and HDR candidates should always consult their College / Centre Human Ethics advisor, and the JCU Connect Ethics and Research Integrity team for specific advice.
Data that has been de-identified no longer triggers the Privacy Act.
Here's an example of sensitive data that has been published as open data. The risk of re-identification via triangulation has been considered and managed, and the de-identified dataset can be downloaded from Research Data Australia.
The PALS (Pregnancy and Lifestyle Study) contains highly sensitive data. Several techniques have been used to de-identify the dataset: identifiers and dates of birth have been removed, ages have been aggregated into bands, and postcodes have been excluded. Without these measures it would be possible to re-identify (triangulate) participants by combining, for example, a rural postcode with a rare occupation.
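The techniques described for PALS can be sketched in a few lines of Python. This is an illustration only, not the method the PALS custodians used: the field names (`name`, `date_of_birth`, `age`, `postcode`, `occupation`), the 10-year band width and the sample record are all assumptions made up for the example.

```python
# Hypothetical field names; adjust these to the dataset at hand.
DIRECT_IDENTIFIERS = {"name", "date_of_birth"}
EXCLUDED_FIELDS = {"postcode"}


def age_band(age, width=10):
    """Return a band label such as '30-39' for an exact age."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record):
    """Return a copy of record with identifiers dropped and age banded."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key not in EXCLUDED_FIELDS
    }
    if "age" in cleaned:
        # Replace the exact age with a coarser band.
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


# Invented sample record, for illustration only.
participant = {
    "name": "Example Person",
    "date_of_birth": "1985-03-14",
    "age": 34,
    "postcode": "4814",
    "occupation": "teacher",
}
deidentified = deidentify(participant)
print(deidentified)
```

Removing fields and banding values reduces risk but does not guarantee anonymity, so the result should still be checked for rare combinations of the remaining attributes.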
Think about de-identifying your data early, as it can be time-consuming and difficult later. Consult the relevant ANDS guides and seek discipline-specific advice as required.
Research Data JCU - Controlled Access Options
Research Data JCU can provide different levels of access to your data:
1. Metadata Only
Most data can be published via open or conditional access, but this option is useful for sensitive datasets that cannot be de-identified and for highly confidential data. Making metadata available via a Data Publication makes your work more visible and facilitates discussion and collaboration with other researchers. Publication is optional, but you should still complete an archival Data Record for data governance purposes.
2. Metadata + Conditional Access
This can be a good option for sharing sensitive data that has been de-identified. By making access conditional (other researchers will need to contact the data manager or nominated primary contact) you can ensure requestors are genuine researchers and that they will maintain confidentiality and keep data files secure. You can maintain oversight over who is using the data and for what purpose, and decide if you wish to be a collaborator.
3. Metadata + Open Access
Data can be downloaded via a link in Research Data JCU. SPARC defines open data as research data that can be freely used, reused and redistributed by anyone - subject at most to the requirement to attribute and share alike. The ideal route is to ensure data is in a machine-readable format on an easily accessible platform with an open licence applied to it. Some licences are more "open" than others. Look at the Know Your Rights: Understanding CC Licences Poster for a good visual comparison of the licences.
This option maximizes the visibility and potential impact of your data and may be required by your funder or publisher.
You should choose a licence for your data (see the Copyright and Licensing section of the Toolkit) whether your data is open or available via conditional access only, as the licence will govern use of the data once you grant access to it.
Please contact us to discuss if you have special requirements for controlling access or licensing.
It's very important for data owners and managers to be aware that data that is not obviously sensitive (no names or dates of birth, for example), or that has been de-identified, can become sensitive through triangulation or data linkage.
Triangulation in this context is the process of combining several pieces of non-sensitive information (in the same dataset) to determine the identity or sensitivity of a participant or subject.
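One way to look for triangulation risk is to count how often each combination of indirectly identifying attributes occurs and flag combinations shared by very few records. The sketch below assumes hypothetical column names (`postcode`, `occupation`), invented sample rows and an arbitrary threshold `k`; an appropriate threshold and choice of attributes depend on the dataset and discipline.

```python
from collections import Counter

# Hypothetical quasi-identifier columns, chosen for illustration.
QUASI_IDENTIFIERS = ("postcode", "occupation")


def combination(record, quasi_identifiers):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[column] for column in quasi_identifiers)


def risky_records(records, quasi_identifiers=QUASI_IDENTIFIERS, k=2):
    """Return records whose combination of values occurs fewer than k times."""
    counts = Counter(combination(r, quasi_identifiers) for r in records)
    return [r for r in records if counts[combination(r, quasi_identifiers)] < k]


# Invented sample rows: no names or dates of birth are present.
records = [
    {"postcode": "4810", "occupation": "nurse"},
    {"postcode": "4810", "occupation": "nurse"},
    {"postcode": "4871", "occupation": "farrier"},
]
flagged = risky_records(records)
print(flagged)
```

Here the third row is flagged because its postcode and occupation together are unique in the dataset, even though neither value is sensitive on its own.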
Data linkage combines two or more datasets that include the same participant or subject, an activity that carries the risk of re-identification and may place subjects at risk. Data linkage is highly useful (it increases understanding without having to collect new data and derives greater value from existing datasets) and is increasingly common in epidemiology and in the medical, social and ecological sciences.
Researchers should treat the new, linked dataset as an identifiable dataset and assess the risks involved.
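To see why a linked dataset needs fresh risk assessment, consider a minimal sketch of linkage on a shared key. The key name `study_id`, both sample datasets and the `link` helper are hypothetical, and the helper assumes each key appears at most once in the second dataset; real linkage projects typically use more careful matching.

```python
def link(left, right, key):
    """Inner-join two lists of dicts on a shared key.

    Assumes each key value appears at most once in `right`.
    """
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Invented sample datasets, each de-identified on its own.
health = [
    {"study_id": 1, "age_band": "30-39", "condition": "asthma"},
    {"study_id": 2, "age_band": "40-49", "condition": "diabetes"},
]
employment = [
    {"study_id": 1, "postcode": "4871", "occupation": "farrier"},
]

linked = link(health, employment, "study_id")
print(linked)
```

The single linked row now carries age band, condition, postcode and occupation together, a combination that neither source dataset exposed.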
High-risk data integration projects involving information from Australian, state or territory governments will need to be managed by an accredited integrating authority such as the Australian Institute of Health and Welfare (AIHW), the Australian Institute of Family Studies (AIFS) or the Australian Bureau of Statistics (ABS) to ensure security. Once data is linked, researchers will access it through a secure data lab in Canberra, a mobile data lab, a remote access computing environment or another secure arrangement, and output and use of the data will be monitored. The AIHW has useful information on data linkage on its website, including an overview of a typical data integration project.
This section of the Toolkit draws heavily on the ANDS guide:
Australian National Data Service. (2017). Publishing and sharing sensitive data. Retrieved from: http://www.ands.org.au/__data/assets/pdf_file/0010/489187/Sensitive-Data-Guide-2018.pdf
ANDS has an extensive range of resources that relate to sensitive data - click on the links below to access guides, videos, posters and other materials on each of these topics:
NB. On 1 July 2018, ANDS, Nectar and RDS combined to form the Australian Research Data Commons (ARDC).