De-identifying data
What de-identification is, and how to de-identify datasets.
What is de-identification?
De-identification removes or coarsens information so that data can be used with a much lower risk of individuals being identified.
Data de-identification may be used to protect the privacy of individuals and organisations or, for example, to ensure that the spatial locations of minerals, archaeological findings, or endangered species are not publicly available.
Degrees of identification in data
The definitions and examples below, from data.govt.nz, explain the difference between identifiable, de-identified and confidentialised information.
Identifiable
Data that directly or indirectly identifies an individual or business.
Examples
Individual:
Name: Hēni
Gender: Female
Date of birth: 31/01/1985
Address: 28 My Road, Postcode 6012, Wellington
Business:
Name: Puzzles
Type: Paper stationery manufacturing
Employees: 34
Expenditure: $398,000
De-identified
Data which has had information removed from it to reduce the risk of spontaneous recognition.
Examples
Individual:
Name: Unknown
Gender: Female
Date of birth: 1985
Address: Postcode 6012, Wellington
Business:
Name: Unknown
Type: Manufacturing
Employees: 30-40
Expenditure: $398,000
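The step from the identifiable individual example to the de-identified one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the DD/MM/YYYY date format and the comma-separated address layout are taken from the example records above, and the function name deidentify_individual is hypothetical rather than part of any library or of the data.govt.nz guidance.

```python
def deidentify_individual(record):
    """Return a copy of `record` with direct identifiers removed or coarsened.

    Assumes `date_of_birth` is DD/MM/YYYY and `address` is
    "street, postcode, city" (as in the example above).
    """
    # Keep only the year of birth.
    birth_year = record["date_of_birth"].split("/")[-1]
    # Drop the street part of the address; keep postcode and city.
    parts = [part.strip() for part in record["address"].split(",")]
    coarse_address = ", ".join(parts[1:])
    return {
        "name": "Unknown",           # direct identifier removed
        "gender": record["gender"],  # kept as-is
        "date_of_birth": birth_year,
        "address": coarse_address,
    }


heni = {
    "name": "Hēni",
    "gender": "Female",
    "date_of_birth": "31/01/1985",
    "address": "28 My Road, Postcode 6012, Wellington",
}
deidentified = deidentify_individual(heni)
```

The result keeps the gender, the year 1985 and "Postcode 6012, Wellington", matching the de-identified example. Real datasets usually need more care than this, because combinations of the remaining fields can still single someone out.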
Confidentialised
Data that has had statistical methods applied to it to protect against disclosing unauthorised information.
Examples
Individual:
Name: Unknown
Gender: Female
Age: 30-40 years
Address: Wellington
Business:
Name: Unknown
Type: Manufacturing
Employees: 10-100
Expenditure: Under $500,000
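One simple way to produce values like those in the confidentialised business example is to replace exact figures with broad bands. The sketch below assumes hypothetical band edges (0, 10, 100, 1000) and a hypothetical $500,000 threshold chosen to reproduce the example. It is not the method used by data.govt.nz, and real confidentialisation typically draws on a wider set of statistical techniques than banding alone.

```python
# Hypothetical band edges, chosen only to reproduce the example above.
EMPLOYEE_BANDS = [(0, 10), (10, 100), (100, 1000)]


def employee_band(count):
    """Replace an exact employee count with a broad 'low-high' band."""
    if count < 0:
        raise ValueError("employee count cannot be negative")
    for low, high in EMPLOYEE_BANDS:
        if low <= count < high:
            return f"{low}-{high}"
    # Anything at or above the last edge falls in an open-ended band.
    return f"{EMPLOYEE_BANDS[-1][1]}+"


def expenditure_band(amount, threshold=500_000):
    """Report only whether expenditure is under a threshold."""
    if amount < threshold:
        return f"Under ${threshold:,}"
    return f"${threshold:,} or more"


business = {"type": "Manufacturing", "employees": 34, "expenditure": 398_000}
confidentialised = {
    "name": "Unknown",
    "type": business["type"],
    "employees": employee_band(business["employees"]),
    "expenditure": expenditure_band(business["expenditure"]),
}
```

With the example figures, 34 employees falls in the 10-100 band and $398,000 is reported as under $500,000, as in the confidentialised example. Wider bands protect more but carry less information, so the edges are a judgement call for each dataset.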
Resources
For practical guidance on de-identification, on handling different types of data (e.g. qualitative, audio-visual), and on managing identifiable data, see the external resources below:
- Data identifiability section of 12. Health Data, National Ethics Advisory Committee (NEAC)
- Data confidentiality principles and methods, Data.govt.nz
- Identifiable data, Australian Research Data Commons (ARDC)
- Publishing sensitive data guide, Australian Research Data Commons (ARDC)
- A Visual Guide to Practical Data De-Identification, Future of Privacy Forum
Contact
Research Data Support Services
Email: researchdata@auckland.ac.nz
eResearch Engagement Lead
Email: Laura Armstrong