Showing 1 - 10 of 96 for search: '"Shokri, Reza"'
Prior Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs), adapted from classification model attacks, fail due to ignoring the generative process of LLMs across token sequences. In this paper, we present a novel attack that…
External link:
http://arxiv.org/abs/2409.13745
Author:
Tao, Jiashu, Shokri, Reza
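As a rough, hedged illustration of the kind of signal sequence-level membership inference works with (not the attack from arXiv:2409.13745 itself), the Python sketch below scores a text by its average per-token log-likelihood under a causal LLM; the GPT-2 checkpoint and the decision threshold are placeholder assumptions.

# Hypothetical likelihood-based membership signal for an LLM.
# Not the attack from arXiv:2409.13745; model name and threshold are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_log_likelihood(text: str) -> float:
    """Average per-token log-likelihood of `text` under the model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # out.loss is the mean negative log-likelihood over predicted tokens.
    return -out.loss.item()

def membership_guess(text: str, threshold: float = -3.5) -> bool:
    """Guess 'member' when the text is unusually likely under the model."""
    return sequence_log_likelihood(text) > threshold

print(membership_guess("The quick brown fox jumps over the lazy dog."))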
Machine learning models can leak private information about their training data, but the standard methods to measure this risk, based on membership inference attacks (MIAs), have a major limitation. They only check if a given data point exactly…
External link:
http://arxiv.org/abs/2408.05131
Watermarking is a technique used to embed a hidden signal in the probability distribution of text generated by large language models (LLMs), enabling attribution of the text to the originating model. We introduce smoothing attacks and show that existing…
External link:
http://arxiv.org/abs/2407.14206
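For the watermarking entry above (arXiv:2407.14206), the snippet below is only a toy illustration of the general "smoothing" idea: blending a watermarked next-token distribution with a reference distribution so the per-token watermark bias is attenuated. The mixing weight and distributions are made up; this is not the paper's attack.

# Toy sketch of "smoothing" a watermarked next-token distribution by
# mixing it with a reference model's distribution. Not the paper's attack.
import numpy as np

def smooth_distribution(p_watermarked: np.ndarray,
                        p_reference: np.ndarray,
                        lam: float = 0.5) -> np.ndarray:
    """Convex combination that attenuates the watermark's per-token bias."""
    p = (1.0 - lam) * p_watermarked + lam * p_reference
    return p / p.sum()

# Toy example over a 5-token vocabulary.
p_wm = np.array([0.50, 0.10, 0.20, 0.10, 0.10])   # watermark boosts token 0
p_ref = np.array([0.25, 0.20, 0.20, 0.20, 0.15])  # reference distribution
print(smooth_distribution(p_wm, p_ref, lam=0.7))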
The principle of data minimization aims to reduce the amount of data collected, processed or retained to minimize the potential for misuse, unauthorized access, or data breaches. Rooted in privacy-by-design principles, data minimization has been endorsed…
External link:
http://arxiv.org/abs/2405.19471
Membership inference attacks aim to detect if a particular data point was used in training a model. We design a novel statistical test to perform robust membership inference attacks (RMIA) with low computational overhead. We achieve this by a fine-grained…
External link:
http://arxiv.org/abs/2312.03262
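For the RMIA entry above (arXiv:2312.03262), a hedged reading of the "fine-grained" statistical test is a pairwise likelihood-ratio comparison of the target point against population samples. The sketch below assumes the caller supplies probabilities under the target model and under reference models; the threshold gamma and the toy numbers are placeholders, not the paper's calibrated procedure.

# Hedged sketch of a pairwise likelihood-ratio membership score in the spirit
# of RMIA (arXiv:2312.03262); all probabilities here are caller-supplied placeholders.
import numpy as np

def rmia_style_score(p_x_theta: float, p_x_ref: float,
                     p_z_theta: np.ndarray, p_z_ref: np.ndarray,
                     gamma: float = 1.0) -> float:
    """
    Fraction of population samples z that the target x 'dominates':
    (p(x|theta) / p(x)) / (p(z|theta) / p(z)) >= gamma.
    A larger fraction suggests x is more likely a training member.
    """
    lr_x = p_x_theta / p_x_ref
    lr_z = p_z_theta / p_z_ref
    return float(np.mean(lr_x / lr_z >= gamma))

# Toy usage with made-up probabilities for 4 population samples.
score = rmia_style_score(
    p_x_theta=0.08, p_x_ref=0.02,
    p_z_theta=np.array([0.10, 0.05, 0.01, 0.04]),
    p_z_ref=np.array([0.02, 0.05, 0.02, 0.03]),
)
print(score)  # 0.75 for these toy numbers -> predict "member" above a chosen cutoff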
We analytically investigate how over-parameterization of models in randomized machine learning algorithms impacts the information leakage about their training data. Specifically, we prove a privacy bound for the KL divergence between model distributions…
External link:
http://arxiv.org/abs/2310.20579
Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy…
External link:
http://arxiv.org/abs/2310.19973
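The entry above (arXiv:2310.19973) concerns how randomness enters DP training. Purely as a generic illustration of those sources (not the paper's accounting), the simplified DP-SGD step below makes random initialization, Poisson batch subsampling, and Gaussian gradient noise explicit; the clipping norm, sampling rate and noise multiplier are arbitrary placeholders.

# Generic, simplified DP-SGD step showing typical sources of randomness
# (initialization, Poisson subsampling, Gaussian noise). Not from the paper.
import numpy as np

rng = np.random.default_rng(seed=0)

def dp_sgd_step(w, X, y, lr=0.1, sample_rate=0.1, clip=1.0, noise_mult=1.0):
    # Randomness source 1: Poisson subsampling of the batch.
    mask = rng.random(len(X)) < sample_rate
    Xb, yb = X[mask], y[mask]
    if len(Xb) == 0:
        return w
    # Per-example gradients of squared error for a linear model.
    residuals = Xb @ w - yb
    grads = residuals[:, None] * Xb
    # Clip each per-example gradient to L2 norm <= clip.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    # Randomness source 2: Gaussian noise on the summed clipped gradient.
    noisy_sum = grads.sum(axis=0) + rng.normal(0, noise_mult * clip, size=w.shape)
    return w - lr * noisy_sum / max(len(X) * sample_rate, 1)

# Randomness source 3: random initialization of the model parameters.
X = rng.normal(size=(100, 5)); y = rng.normal(size=100)
w = rng.normal(size=5)
for _ in range(10):
    w = dp_sgd_step(w, X, y)
print(w)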
Author:
Mireshghallah, Niloofar, Kim, Hyunwoo, Zhou, Xuhui, Tsvetkov, Yulia, Sap, Maarten, Shokri, Reza, Choi, Yejin
The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and are expected to reason…
External link:
http://arxiv.org/abs/2310.17884
We introduce an analytical framework to quantify the changes in a machine learning algorithm's output distribution following the inclusion of a few data points in its training set, a notion we define as leave-one-out distinguishability (LOOD). This is…
External link:
http://arxiv.org/abs/2309.17310
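For the LOOD entry above (arXiv:2309.17310), a crude empirical analogue of leave-one-out distinguishability (not the paper's analytical framework) is to retrain a randomized learner with and without a target point across many seeds and compare the two distributions of its prediction at a query point. The noisy ridge learner and the Gaussian KL estimate below are illustrative assumptions.

# Toy empirical analogue of leave-one-out distinguishability: compare the
# distribution of a randomized learner's prediction at a query point when a
# target example is included vs. excluded. Illustration only.
import numpy as np

rng = np.random.default_rng(1)

def train_and_predict(X, y, x_query, seed):
    """Noisy ridge regression as a stand-in for a randomized learner."""
    r = np.random.default_rng(seed)
    w = np.linalg.solve(X.T @ X + 0.1 * np.eye(X.shape[1]), X.T @ y)
    w = w + r.normal(0, 0.05, size=w.shape)  # training randomness
    return float(x_query @ w)

X = rng.normal(size=(50, 3)); y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 50)
x_target, y_target = rng.normal(size=3) * 3, 10.0   # an unusual target point
x_query = x_target                                   # probe where it matters most

with_pt = [train_and_predict(np.vstack([X, x_target]), np.append(y, y_target), x_query, s)
           for s in range(200)]
without_pt = [train_and_predict(X, y, x_query, s) for s in range(200)]

def gaussian_kl(a, b):
    """KL divergence between Gaussians fitted to the two prediction samples."""
    m1, v1, m2, v2 = np.mean(a), np.var(a) + 1e-12, np.mean(b), np.var(b) + 1e-12
    return 0.5 * (np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1)

print("LOOD-style KL estimate:", gaussian_kl(with_pt, without_pt))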
Repeated parameter sharing in federated learning causes significant information leakage about private data, thus defeating its main purpose: data privacy. Mitigating the risk of this information leakage, using state of the art differentially private…
External link:
http://arxiv.org/abs/2309.05505