Statistical Tradeoffs between Generalization and Suppression in the De-identification of Large-Scale Data Sets

Authors:	Olivia Angiuli, James H. Waldo
Publication year:	2016
Subject:	0301 basic medicine; Information privacy; Computer science; Privacy software; Generalization; De-identification; 02 engineering and technology; computer.software_genre; Data set; Set (abstract data type); 03 medical and health sciences; 030104 developmental biology; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Data mining; computer; Private information retrieval; Anonymity
Source:	2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC).
Description:	Data sets containing private information about individuals must satisfy privacy standards before being publicly released. One such standard, k-anonymity, reduces the probability of the re-identification of individuals by requiring that rare combinations of personally identifiable information be represented by at least k distinct individuals. Records that violate this standard must be altered, which can lead to significant distortion of the statistical properties of the data set. In this paper, we discuss improvements to two techniques used to achieve k-anonymity, generalization and suppression, that confer k-anonymity while better preserving the statistical properties of an educational data set taken from a massive online open course platform, edX.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::4a17fde70c3169b667e3e6ca71e1f8a7 https://doi.org/10.1109/compsac.2016.198
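
Illustrative note: the description above names two ways of reaching k-anonymity, suppression (dropping records whose combination of identifying values is too rare) and generalization (coarsening values so that more records share a combination). The following is a minimal Python sketch of that general idea only, not the authors' method. The toy records, the field names `age`, `country`, and `grade`, and the helper functions are all hypothetical and are not taken from the paper or from edX data.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination present is shared by at least k records."""
    return all(n >= k for n in group_sizes(records, quasi_identifiers).values())

def suppress(records, quasi_identifiers, k):
    """Suppression: drop every record whose combination occurs fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records
            if sizes[tuple(r[q] for q in quasi_identifiers)] >= k]

def generalize_age(records, width=10):
    """Generalization: replace an exact age with a range of the given width."""
    generalized = []
    for r in records:
        low = (r["age"] // width) * width
        generalized.append({**r, "age": f"{low}-{low + width - 1}"})
    return generalized

# Hypothetical toy data; "age" and "country" play the role of quasi-identifiers.
records = [
    {"age": 23, "country": "US", "grade": 0.91},
    {"age": 27, "country": "US", "grade": 0.75},
    {"age": 31, "country": "US", "grade": 0.60},
    {"age": 34, "country": "US", "grade": 0.82},
    {"age": 58, "country": "FR", "grade": 0.88},
]
QI = ["age", "country"]
K = 2

# Suppression alone: every exact age is unique, so every record is dropped.
suppressed_only = suppress(records, QI, K)

# Generalizing first lets most records survive the same suppression step.
generalized = suppress(generalize_age(records), QI, K)

print(len(suppressed_only), len(generalized))
```

In this toy case, suppression alone discards the whole data set, while generalizing ages into ten-year ranges first retains four of the five records at the cost of less precise ages. That is the kind of tradeoff between the two techniques that the description refers to.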