Differential privacy and generalization: Sharper bounds with applications
Author: | Sandro Ridella, Davide Anguita, Luca Oneto |
---|---|
Language: | English |
Year of publication: | 2017 |
Subject: |
Randomized classifier; Machine learning; Gibbs classifier; Model selection; Error estimation; Differential privacy; Chernoff type bound; Bennett type bounds; Thresholdout; Randomized algorithm; Catoni prior and posterior; Generalization bound; Artificial intelligence |
Description: | We derive new Chernoff and Bennett type risk bounds based on Differential Privacy (DP). CDP, a randomized learning algorithm based on Catoni's work, is DP. CDP has better generalization properties than the Catoni-based Gibbs classifier. We discuss the use of Thresholdout for model selection and error estimation purposes. We improve the risk bound for Thresholdout. In this paper we deal with the problem of improving the recent milestone results on the estimation of the generalization capability of a randomized learning algorithm based on Differential Privacy (DP). In particular, we derive new DP-based multiplicative Chernoff and Bennett type generalization bounds, which improve over the current state-of-the-art Hoeffding type bound. Then, we prove that a randomized algorithm based on the data-generating-dependent prior and data-dependent posterior Boltzmann distributions of Catoni (2007) [10] is Differentially Private and shows better generalization properties than the Gibbs classifier associated with the same distributions. We illustrate this with a simple example. Finally, we discuss the advantages of using the Thresholdout procedure, one of the main results generated by the DP theory, for Model Selection and Error Estimation purposes, and we derive a new result which exploits our new generalization bounds. |
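The Thresholdout procedure mentioned above (Dwork et al., 2015) reuses a holdout set across many adaptive queries by answering from the training set when the two sets agree and adding noise when they differ. A minimal sketch follows; the parameter names, default threshold, and noise scales here are illustrative assumptions, not the exact construction or constants analyzed in the paper.

```python
import numpy as np

def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """One Thresholdout query (sketch, after Dwork et al., 2015).

    train_vals / holdout_vals: per-example values of the query function
    (e.g. 0/1 losses) evaluated on the training and holdout sets.
    Returns an estimate of the query's expectation that degrades the
    holdout set far more slowly than naive reuse would.
    """
    rng = np.random.default_rng(rng)
    t_mean = float(np.mean(train_vals))
    h_mean = float(np.mean(holdout_vals))
    # Perturb the agreement test with Laplace noise: this randomized
    # comparison is what makes the mechanism differentially private.
    eta = rng.laplace(scale=4 * sigma)
    if abs(t_mean - h_mean) > threshold + eta:
        # Means disagree (likely overfitting): answer from the holdout
        # set, masked with fresh Laplace noise.
        return h_mean + rng.laplace(scale=sigma)
    # Means agree: the training estimate is safe to release as-is.
    return t_mean
```

When the training and holdout means coincide, the returned estimate is (with high probability) the unperturbed training mean; noise is only spent on queries that actually probe the holdout set.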
Database: | OpenAIRE |
External link: |