Autor: |
Yangdi Jiang, Lucy Mosquera, Bei Jiang, Linglong Kong, Khaled El Emam |
Jazyk: |
angličtina |
Rok vydání: |
2022 |
Předmět: |
|
Zdroj: |
PLoS ONE, Vol 17, Iss 6 (2022) |
Druh dokumentu: |
article |
ISSN: |
1932-6203 |
Popis: |
Background One common way to share health data for secondary analysis while meeting increasingly strict privacy regulations is to de-identify it. To demonstrate that the risk of re-identification is acceptably low, re-identification risk metrics are used. There is a dearth of good risk estimators modeling the attack scenario where an adversary selects a record from the microdata sample and attempts to match it with individuals in the population. Objectives Develop an accurate risk estimator for the sample-to-population attack. Methods A type of estimator based on creating a synthetic variant of a population dataset was developed to estimate the re-identification risk for an adversary performing a sample-to-population attack. The accuracy of the estimator was evaluated through a simulation on four different datasets in terms of estimation error. Two estimators were considered, a Gaussian copula and a d-vine copula. They were compared against three other estimators proposed in the literature. Results Taking the average of the two copula estimates consistently had a median error below 0.05 across all sampling fractions and true risk values. This was significantly more accurate than existing methods. A sensitivity analysis of the estimator accuracy based on variation in input parameter accuracy provides further application guidance. The estimator was then used to assess re-identification risk and de-identify a large Ontario COVID-19 behavioral survey dataset. Conclusions The average of two copula estimators consistently provides the most accurate re-identification risk estimate and can serve as a good basis for managing privacy risks when data are de-identified and shared. |
Databáze: |
Directory of Open Access Journals |
Externí odkaz: |
|
Nepřihlášeným uživatelům se plný text nezobrazuje |
K zobrazení výsledku je třeba se přihlásit.
|