
Data De-Identification


March 20, 2020 update: While the physical SFU Libraries are temporarily closed due to COVID-19 measures, we are happy to continue supporting you remotely with any research data management questions. Please feel free to contact us by email during this time, and we can set up a meeting by video conference or telephone.


De-identification is the process of removing or masking information in a dataset that could be used to identify an individual. This process is fundamental to enabling the sharing and re-use of data for secondary research purposes. The likelihood that an individual can be identified from a given dataset is called disclosure risk, and it is an important consideration when collecting, analysing, and sharing research data. De-identification balances the risk of disclosure against the increased research value of a shared dataset.
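As a rough illustration of how disclosure risk might be estimated, the sketch below groups records by their indirect identifiers and treats the smallest group as the worst case. This is one common approach (in the spirit of k-anonymity), not necessarily the method taught in the workshop; the field names, sample records, and function names are all hypothetical.

```python
from collections import Counter


def class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def max_risk(records, quasi_identifiers):
    """Worst-case re-identification risk: 1 divided by the smallest group size."""
    sizes = class_sizes(records, quasi_identifiers)
    return 1 / min(sizes.values())


# Hypothetical sample data: age and postal prefix act as indirect identifiers.
records = [
    {"age": 34, "postal": "V5A", "dx": "asthma"},
    {"age": 34, "postal": "V5A", "dx": "flu"},
    {"age": 35, "postal": "V5A", "dx": "flu"},
    {"age": 61, "postal": "V6B", "dx": "diabetes"},
]

sizes = class_sizes(records, ["age", "postal"])
risk = max_risk(records, ["age", "postal"])
```

Here two records are unique on age and postal prefix, so the worst-case risk is 1.0: anyone who knows those two values for one of those individuals could single out their record.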

This workshop touches on issues related to sharing sensitive data and offers practical suggestions on how such data can be made ready for re-use. Topics include direct and indirect identifiers, how to assess disclosure risk, risk thresholds and measurement, and how to reduce disclosure risk in various academic disciplines with techniques such as generalization, suppression, and subsampling.
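The three techniques named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workshop's own material: the record fields, the ten-year age bands, the threshold k=2, and the sampling fraction are all hypothetical choices.

```python
import random
from collections import Counter


def generalize_age(record, width=10):
    """Generalization: replace an exact age with a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


def suppress_small_groups(records, quasi_identifiers, k):
    """Suppression: drop records whose quasi-identifier combination
    occurs fewer than k times in the dataset."""
    def key(r):
        return tuple(r[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= k]


def subsample(records, fraction, seed=0):
    """Subsampling: release only a random fraction of the records."""
    rng = random.Random(seed)
    n = round(len(records) * fraction)
    return rng.sample(records, n)


# Hypothetical sample data.
records = [
    {"age": 34, "postal": "V5A"},
    {"age": 36, "postal": "V5A"},
    {"age": 38, "postal": "V5A"},
    {"age": 61, "postal": "V6B"},
]

generalized = [generalize_age(r) for r in records]
released = suppress_small_groups(generalized, ["age", "postal"], k=2)
sample = subsample(released, fraction=0.67)
```

Generalization merges the three younger participants into a single "30-39" group, suppression then removes the one record that is still unique, and subsampling releases only part of what remains so that an outsider cannot be sure a given person is in the shared file. Each step lowers disclosure risk at some cost to the detail available for re-use.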

Register for upcoming workshops

No upcoming instances of this workshop found.