Showing 1 - 10 of 25
for search: '"Chen, Dingfan"'
Despite being prevalent in the general field of Natural Language Processing (NLP), pre-trained language models inherently carry privacy and copyright concerns due to their nature of training on large-scale web-scraped data. In this paper, we pioneer …
External link:
http://arxiv.org/abs/2408.11046
Author:
Zhu, Derui, Chen, Dingfan, Li, Qing, Chen, Zongxiong, Ma, Lei, Grossklags, Jens, Fritz, Mario
Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of hallucination, where the model fabricates facts and produces non-factual statements. In …
External link:
http://arxiv.org/abs/2404.04722
Author:
Chen, Dingfan, Oestreich, Marie, Afonja, Tejumade, Kerkouche, Raouf, Becker, Matthias, Fritz, Mario
Published in:
Proceedings on Privacy Enhancing Technologies (PoPETs 2024)
Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications. Existing literature, however, primarily focuses on basic benchmarking datasets and tends to re…
External link:
http://arxiv.org/abs/2402.04912
The availability of rich and vast data sources has greatly advanced machine learning applications in various domains. However, data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing. Overco…
External link:
http://arxiv.org/abs/2309.15696
The potential of realistic and useful synthetic data is significant. However, current evaluation methods for synthetic tabular data generation predominantly focus on downstream task usefulness, often neglecting the importance of statistical properties …
External link:
http://arxiv.org/abs/2307.07997
In recent years, diffusion models have achieved tremendous success in the field of image generation, becoming the state-of-the-art technology for AI-based image processing applications. Despite the numerous benefits brought by recent advances in diffusion …
External link:
http://arxiv.org/abs/2302.07801
Conventional gradient-sharing approaches for federated learning (FL), such as FedAvg, rely on aggregation of local models and often face performance degradation under differential privacy (DP) mechanisms or data heterogeneity, which can be attributed …
External link:
http://arxiv.org/abs/2302.01068
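The entry above refers to FedAvg-style aggregation, in which a server averages locally trained model parameters weighted by each client's dataset size. A minimal illustrative sketch of that averaging step (not the method proposed in the linked paper; the function name is hypothetical):

```python
# FedAvg aggregation sketch: average client parameters,
# weighted by the number of local training samples.

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of per-client parameter lists.

    client_weights: list of parameter lists, one per client.
    client_sizes: number of training samples per client.
    """
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    aggregated = []
    for p in range(num_params):
        agg = sum(
            (size / total) * w[p]
            for w, size in zip(client_weights, client_sizes)
        )
        aggregated.append(agg)
    return aggregated

# Example: two clients with scalar "parameters" for simplicity.
global_params = fedavg_aggregate(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)
# global_params == [2.5, 3.5]
```

In practice the averaged quantities are gradient or weight tensors, and DP variants add calibrated noise to the shared updates before aggregation.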
Published in:
36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Differentially private data generation techniques have become a promising solution to the data privacy challenge -- it enables sharing of data while complying with rigorous privacy guarantees, which is essential for scientific progress in sensitive d…
External link:
http://arxiv.org/abs/2211.04446
Published in:
International Conference on Learning Representations 2022
As a long-term threat to the privacy of training data, membership inference attacks (MIAs) emerge ubiquitously in machine learning models. Existing works evidence a strong connection between the distinguishability of the training and testing loss distributions …
External link:
http://arxiv.org/abs/2207.05801
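The snippet above points at the core signal behind many membership inference attacks: training samples tend to have lower loss than unseen samples. A toy loss-threshold baseline illustrating this distinguishability (a standard textbook baseline, not the attack proposed in the linked paper; names are hypothetical):

```python
# Loss-threshold membership inference baseline:
# predict "member" when a sample's loss falls below a threshold,
# exploiting the gap between training and testing loss distributions.

def loss_threshold_mia(losses, threshold):
    """Return True for samples predicted to be training members."""
    return [loss < threshold for loss in losses]

# Training losses are typically lower than test losses.
train_losses = [0.10, 0.20, 0.15]
test_losses = [0.90, 1.20, 0.80]
preds = loss_threshold_mia(train_losses + test_losses, threshold=0.5)
# preds == [True, True, True, False, False, False]
```

The attack succeeds exactly to the extent that the two loss distributions are separable, which is why reducing that gap (e.g. via regularization or DP training) is a common defense.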
Over the past years, deep generative models have achieved a new level of performance. Generated data has become difficult, if not impossible, to distinguish from real data. While there are plenty of use cases that benefit from this technology, t…
External link:
http://arxiv.org/abs/2012.08726