Proportional representation to increase data utility in k-anonymous tables

Authors: Viton, Fabien; Mauger, Clémence; Dequen, Gilles; Guérin, Jean-Luc; Le Mahec, Gaël
Contributors: Modélisation, Information et Systèmes - UR UPJV 4290 (MIS), Université de Picardie Jules Verne (UPJV)
Year of publication: 2021
Subject:
Source: 26th IEEE Symposium on Computers and Communications, Sep 2021, Athens, Greece
DOI: 10.1109/iscc53001.2021.9631457
Description: The growing volume of published data has driven the development of data mining, which uses that data to extract knowledge. At the same time, anonymization models such as k-anonymity have emerged to address privacy concerns. Because k-anonymity transforms the original data, it affects the utility of the altered data for data mining. In this paper, we propose a new representation of anonymous tables, obtained through an anonymization post-processing step. The proposed representation preserves more information about the distribution of the original values within the anonymous equivalence classes, and it can be used directly as input to neural networks for data mining. We evaluate our experimental protocol on two data sets from the anonymization research field: the Adult data set and an extract from the voter register of Florida (USA). These experiments show that our approach yields higher data utility than classical approaches.
Database: OpenAIRE
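
Illustrative sketch: the abstract describes a representation that keeps information on how the original values are distributed inside each anonymous equivalence class, in a form a neural network can consume directly. The record does not give the exact encoding, so the following is only one plausible reading, assuming a single categorical attribute whose original values in a class are summarized as a vector of proportions. The function names, the attribute domain, and the sample class are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def proportional_vector(original_values, domain):
    """Share of each domain value among the records of one equivalence class.

    Assumption: this is a guess at what a "proportional representation"
    could look like, not the authors' published encoding.
    """
    if not original_values:
        raise ValueError("an equivalence class must contain at least one record")
    counts = Counter(original_values)
    total = len(original_values)
    return [counts[value] / total for value in domain]


def presence_vector(original_values, domain):
    """Baseline for contrast: only records which values occur in the class,
    as a set-valued generalization would, without their frequencies."""
    present = set(original_values)
    return [1.0 if value in present else 0.0 for value in domain]


# Hypothetical attribute domain and a hypothetical equivalence class of size 4.
domain = ["Private", "Self-emp", "Gov"]
equivalence_class = ["Private", "Private", "Private", "Gov"]

proportions = proportional_vector(equivalence_class, domain)
presence = presence_vector(equivalence_class, domain)
```

Under these assumptions, the proportional vector distinguishes a class that is mostly "Private" from one that is mostly "Gov", whereas the presence vector would be identical for both. Either vector has a fixed length equal to the domain size, so it could be fed to a neural network input layer without further encoding.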