About the workshop
De-identification is the process of removing or masking information in a dataset that could be used to identify an individual. This process is fundamental to enabling the sharing and re-use of data for secondary research purposes. The likelihood that an individual can be identified from a given dataset is known as disclosure risk, and it is an important consideration when collecting, analysing, and sharing research data. De-identification helps balance the risk of disclosure against the increased research value of a shared dataset.
This workshop touches on issues related to sharing sensitive data and offers practical suggestions on how such data can be made ready for re-use. Topics include assessing disclosure risk, distinguishing direct from indirect identifiers, setting risk thresholds and measuring risk, and reducing disclosure risk in various academic disciplines with techniques such as generalization, suppression, and subsampling.
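As a rough illustration of how generalization and suppression can work together, the Python sketch below uses a small invented dataset. Everything in it is an assumption made for the example rather than workshop material: the records, the field names, the helper functions, and the choice to measure risk by counting how many records share the same combination of indirect identifiers (an approach commonly called k-anonymity), with a threshold of k = 2.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only.
records = [
    {"name": "A. Smith", "age": 34, "postcode": "2601", "diagnosis": "asthma"},
    {"name": "B. Jones", "age": 37, "postcode": "2612", "diagnosis": "diabetes"},
    {"name": "C. Brown", "age": 52, "postcode": "3000", "diagnosis": "asthma"},
    {"name": "D. White", "age": 58, "postcode": "3052", "diagnosis": "flu"},
    {"name": "E. Green", "age": 71, "postcode": "4000", "diagnosis": "asthma"},
]

# Indirect identifiers that remain after generalization (assumed for this sketch).
QUASI_IDENTIFIERS = ("age_band", "region")


def generalize(record):
    """Drop the direct identifier (name) and coarsen the indirect ones."""
    low = (record["age"] // 10) * 10
    return {
        "age_band": f"{low}-{low + 9}",           # exact age -> 10-year band
        "region": record["postcode"][:2] + "**",  # full postcode -> 2-digit prefix
        "diagnosis": record["diagnosis"],
    }


def class_sizes(rows):
    """Count how many rows share each combination of indirect identifiers."""
    return Counter(tuple(row[q] for q in QUASI_IDENTIFIERS) for row in rows)


def suppress_small_classes(rows, k):
    """Remove rows whose combination of indirect identifiers occurs fewer than k times."""
    sizes = class_sizes(rows)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in QUASI_IDENTIFIERS)] >= k
    ]


generalized = [generalize(r) for r in records]
released = suppress_small_classes(generalized, k=2)

print("Smallest group before suppression:", min(class_sizes(generalized).values()))
print("Smallest group after suppression:", min(class_sizes(released).values()))
print("Records released:", len(released), "of", len(records))
```

In this toy case, one record remains unique after generalization and is suppressed, so every released record shares its age band and region with at least one other. Subsampling is not shown. An appropriate threshold and set of indirect identifiers depend on the dataset and the discipline.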