Differentially Private Nearest Neighbor Classification
Authors: | Yucel Saygin, Mehmet Emre Gursoy, Ali Inan, Mehmet Ercan Nergiz |
---|---|
Year of publication: | 2017 |
Subject: |
Computer Networks and Communications; Computer science; Privacy protection; 02 engineering and technology; Machine learning; Computer Science Applications; k-nearest neighbors algorithm; Random subspace method; Information sensitivity; Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Differential privacy; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; Classifier (UML); Information Systems |
Description: | Instance-based learning, and the k-nearest neighbors algorithm (k-NN) in particular, provides simple yet effective classification algorithms for data mining. Classifiers are often executed on sensitive information such as medical or personal data. Differential privacy has recently emerged as the accepted standard for privacy protection in sensitive data. However, straightforward applications of differential privacy to k-NN classification yield rather inaccurate results. Motivated by this, we develop algorithms to increase the accuracy of private instance-based classification. We first describe the radius neighbors classifier (r-N) and show that its accuracy under differential privacy can be greatly improved by a non-trivial sensitivity analysis. Then, for k-NN classification, we build algorithms that convert k-NN classifiers to r-N classifiers. We experimentally evaluate the accuracy of both classifiers using various datasets. Experiments show that our proposed classifiers significantly outperform both baseline private classifiers (i.e., straightforward applications of differential privacy) and classifiers executed on a dataset published using differential privacy. In addition, the accuracy of our proposed k-NN classifiers is at least comparable to, and in many cases better than, that of other differentially private machine learning techniques. |
Database: | OpenAIRE |
External link: |
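
As a rough illustration of the radius neighbors idea named in the description, the sketch below counts the training labels within a fixed radius of a query point, perturbs each count with Laplace noise, and returns the label with the largest noisy count. It is a plain Laplace-mechanism baseline, not the authors' algorithm, and it does not include their improved sensitivity analysis. The function names, the `epsilon` parameter, and the assumption that the label set is public and fixed in advance are all hypothetical choices made for this example. Adding or removing one training record changes at most one count by 1, so a noise scale of 1/epsilon is used; the guarantee covers a single query, and repeated queries would consume additional privacy budget.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_radius_classify(train, query, radius, labels, epsilon, rng=None):
    """Classify `query` from noisy label counts within `radius` (single query).

    train  : list of (point, label) pairs, where point is a tuple of floats
    labels : the full label set, assumed public and independent of the data
    epsilon: privacy parameter; each count receives Laplace(1/epsilon) noise
    """
    rng = rng or random.Random()
    counts = {label: 0 for label in labels}
    for point, label in train:
        # Only training points inside the radius vote for their label.
        if math.dist(point, query) <= radius:
            counts[label] += 1
    # Perturb every count, including zero counts, before taking the maximum.
    noisy = {
        label: count + laplace_noise(1.0 / epsilon, rng)
        for label, count in counts.items()
    }
    return max(noisy, key=noisy.get)


if __name__ == "__main__":
    data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((5.0, 5.0), "b")]
    print(private_radius_classify(data, (0.0, 0.0), 1.0, ["a", "b"], epsilon=1.0))
```

With a small `epsilon` the noise can easily outweigh the true counts when few points fall inside the radius, which is consistent with the description's remark that straightforward applications of differential privacy give inaccurate results.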