Scalable local recoding anonymization to preserve privacy in big data mining.

Authors: Bhuvaneswari, E., Kalaiselvi, R., Devi, K. Rama, Tummala, Rama Krishna, Shanthi, G.
Subject:
Source: AIP Conference Proceedings; 2024, Vol. 2742 Issue 1, p1-6, 6p
Abstract: Nowadays, massive amounts of data are produced in various sectors such as health care, insurance, banking, and the stock market. Disseminating this data for mining and analysis yields valuable knowledge that supports social and economic development. The greatest challenge in many big data applications is privacy. Disclosure of sensitive information may cause financial loss or damage to an individual's reputation. Hence, various techniques have been proposed to guard against privacy leaks. We propose a novel, highly scalable hybrid LSH-CBSAA method that anonymizes big datasets to provide data privacy. The proposed method has two phases. The first phase divides the original dataset into smaller units using LSH (locality-sensitive hashing); a Min-Hash function is used to split the dataset into several partitions so that computation can be parallelized. In the second phase, CBSAA (Constraint Based Sensitive Attribute Anonymization) is employed to protect the sensitive attribute values of the published dataset against privacy attacks. CBSAA incorporates a new constraint, namely Sensitivity-diversity. It also handles datasets with multiple sensitive attributes. The method produces an anonymized dataset with maximized data utility and minimized delay time. Thus, the utility of the published dataset is sustained, giving precise results in data mining and analysis tasks. The empirical results show that this method is invulnerable to various privacy attacks. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
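
The first phase described in the abstract (splitting a dataset into partitions with Min-Hash based LSH) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the token encoding of records (for example "age:30-39"), the use of CRC32 as a base hash, and the single-band bucketing are all assumptions made for illustration, and the CBSAA phase is not shown.

```python
import random
import zlib
from collections import defaultdict

# Assumption: a fixed Mersenne prime (2^31 - 1) as modulus for the hash family.
PRIME = 2_147_483_647


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(tokens, params):
    """Min-Hash signature of a non-empty collection of string tokens."""
    # CRC32 is deterministic across runs, unlike Python's built-in hash() for str.
    ids = [zlib.crc32(t.encode("utf-8")) for t in tokens]
    return tuple(min((a * x + b) % PRIME for x in ids) for a, b in params)


def partition_by_lsh(records, params, rows_per_band):
    """Group record ids whose first `rows_per_band` signature rows match.

    `records` maps a record id to its token set (hypothetical encoding of
    attribute values). Each resulting bucket is one partition that could be
    processed independently, e.g. in parallel.
    """
    buckets = defaultdict(list)
    for rec_id, tokens in records.items():
        sig = minhash_signature(tokens, params)
        buckets[sig[:rows_per_band]].append(rec_id)
    return dict(buckets)
```

Calling `partition_by_lsh(records, make_hash_params(8), rows_per_band=2)` returns a dictionary from bucket keys to lists of record ids. Records with identical token sets always share a bucket, and records with similar token sets are likely to. A smaller `rows_per_band` produces fewer, larger partitions.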