Showing 1 - 10 of 55 for search: '"Mireshghallah, Fatemehsadat"'
Author:
Zhang, Mengke, He, Tianxing, Wang, Tianle, Mi, Lu, Mireshghallah, Fatemehsadat, Chen, Binyi, Wang, Hao, Tsvetkov, Yulia
In the current user-server interaction paradigm of prompted generation with large language models (LLMs) in the cloud, the server fully controls the generation process, leaving users no way to keep the generated text to themselves.
External link:
http://arxiv.org/abs/2309.17157
Author:
Tang, Xinyu, Shin, Richard, Inan, Huseyin A., Manoel, Andre, Mireshghallah, Fatemehsadat, Lin, Zinan, Gopi, Sivakanth, Kulkarni, Janardhan, Sim, Robert
We study the problem of in-context learning (ICL) with large language models (LLMs) on private datasets. This scenario poses privacy risks, as LLMs may leak or regurgitate the private examples demonstrated in the prompt. We propose a novel algorithm…
External link:
http://arxiv.org/abs/2309.11765
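The leakage vector this abstract describes is easy to see in code: ICL pastes private demonstrations verbatim into every prompt sent to the server. Below is a toy sketch of that prompt assembly; all names and records are invented for illustration, and the paper's actual algorithm (cut off in the snippet) is not reproduced here.

```python
# Toy illustration: private examples travel verbatim inside the ICL prompt,
# so the server (and a regurgitating model) sees the raw records.
private_examples = [
    ("Patient reports chest pain.", "cardiology"),
    ("Child has a mild fever.", "pediatrics"),
]

def build_icl_prompt(examples, query):
    """Concatenate labeled private examples ahead of the user query."""
    demos = "\n".join(f"Text: {x}\nLabel: {y}" for x, y in examples)
    return f"{demos}\nText: {query}\nLabel:"

print(build_icl_prompt(private_examples, "Adult presents with a rash."))
```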
Author:
Mattern, Justus, Mireshghallah, Fatemehsadat, Jin, Zhijing, Schölkopf, Bernhard, Sachan, Mrinmaya, Berg-Kirkpatrick, Taylor
Membership inference attacks (MIAs) aim to predict whether a data sample was present in the training data of a machine learning model, and are widely used for assessing the privacy risks of language models. Most existing attacks rely on the…
External link:
http://arxiv.org/abs/2305.18462
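For orientation, here is a generic loss-threshold MIA sketch, plainly not the paper's specific method: models tend to assign lower loss to their training samples, so a sample whose loss falls below a calibrated threshold is predicted to be a member. The toy unigram "model" below stands in for a real LM.

```python
# Generic loss-threshold membership inference: low loss => predict "member".
import math
from collections import Counter

train = ["the cat sat on the mat", "dogs bark at night"]
counts = Counter(w for s in train for w in s.split())
total = sum(counts.values())

def avg_nll(sentence, alpha=1.0):
    """Average negative log-likelihood under a toy add-alpha unigram model."""
    vocab = len(counts) + 1
    words = sentence.split()
    return -sum(math.log((counts[w] + alpha) / (total + alpha * vocab))
                for w in words) / len(words)

def is_member(sentence, threshold=2.5):
    return avg_nll(sentence) < threshold

print(is_member("the cat sat on the mat"))   # True: seen in training
print(is_member("quantum flux capacitors"))  # False: unseen, high loss
```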
LLM-powered chatbots are becoming widely adopted in applications such as healthcare, personal assistants, and industry hiring decisions. In many of these cases, chatbots are fed sensitive, personal information in their prompts, as samples for in-context learning…
External link:
http://arxiv.org/abs/2305.15008
Task-oriented dialogue systems often assist users with personal or confidential matters. For this reason, the developers of such a system are generally prohibited from observing actual usage. So how can they know where the system is failing and needs…
External link:
http://arxiv.org/abs/2212.10520
Author:
Mireshghallah, Fatemehsadat, Vogler, Nikolai, He, Junxian, Florez, Omar, El-Kishky, Ahmed, Berg-Kirkpatrick, Taylor
User-generated social media data is constantly changing as new trends influence online discussion and personal information is deleted due to privacy concerns. However, most current NLP models are static and rely on fixed training data, which means…
External link:
http://arxiv.org/abs/2209.05706
Author:
Mireshghallah, Fatemehsadat, Backurs, Arturs, Inan, Huseyin A, Wutschitz, Lukas, Kulkarni, Janardhan
Recent papers have shown that large pre-trained language models (LLMs) such as BERT and GPT-2 can be fine-tuned on private data to achieve performance comparable to non-private models for many downstream Natural Language Processing (NLP) tasks while…
External link:
http://arxiv.org/abs/2206.01838
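The private fine-tuning setting the abstract refers to is typically built on DP-SGD (per-example gradient clipping plus Gaussian noise). The sketch below shows that standard recipe as background only; it is not the paper's code, and the hyperparameters are placeholders.

```python
# Minimal DP-SGD step: clip each per-example gradient to L2 norm <= clip,
# sum, add Gaussian noise calibrated to the clipping bound, then update.
import torch

def dp_sgd_step(model, loss_fn, inputs, targets, lr=0.1, clip=1.0, sigma=1.0):
    """One differentially private SGD step over a batch of examples."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(inputs, targets):
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        # Clip this example's gradient.
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params)).item()
        scale = min(1.0, clip / (norm + 1e-6))
        for s, p in zip(summed, params):
            s.add_(p.grad, alpha=scale)
    with torch.no_grad():
        for s, p in zip(summed, params):
            s.add_(torch.normal(0.0, sigma * clip, size=s.shape))
            p.add_(s, alpha=-lr / len(inputs))
```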
Author:
Mireshghallah, Fatemehsadat, Uniyal, Archit, Wang, Tianhao, Evans, David, Berg-Kirkpatrick, Taylor
Large language models have been shown to present privacy risks through memorization of training data, and several recent works have studied such risks for the pre-training phase. Little attention, however, has been given to the fine-tuning phase, and it is…
External link:
http://arxiv.org/abs/2205.12506
Author:
Garcia, Mirian Hipolito, Manoel, Andre, Diaz, Daniel Madrigal, Mireshghallah, Fatemehsadat, Sim, Robert, Dimitriadis, Dimitrios
In this paper we introduce "Federated Learning Utilities and Tools for Experimentation" (FLUTE), a high-performance open-source platform for federated learning research and offline simulations. The goal of FLUTE is to enable rapid prototyping and simulation…
External link:
http://arxiv.org/abs/2203.13789
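FLUTE itself is a full simulation platform; as a reference point for what such platforms orchestrate, here is a minimal federated averaging (FedAvg) round in plain Python. This is a generic sketch, not FLUTE's API.

```python
# One FedAvg aggregation: average client weights, weighted by dataset size.
def fed_avg(client_weights, client_sizes):
    """client_weights: list of {param_name: list_of_floats} dicts."""
    total = sum(client_sizes)
    merged = {}
    for name in client_weights[0]:
        dim = len(client_weights[0][name])
        merged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)
        ]
    return merged

# One simulated round with two clients holding 10 and 30 examples:
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
print(fed_avg(clients, [10, 30]))  # {'w': [2.5, 3.5]}
```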
Recent work on controlled text generation has either required attribute-based fine-tuning of the base language model (LM) or restricted the parameterization of the attribute discriminator to be compatible with the base autoregressive LM. In this…
External link:
http://arxiv.org/abs/2203.13299
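To make the discriminator-guided setting concrete, here is a generic sketch of discriminator-reweighted decoding; the paper's own parameterization is cut off in the snippet and is not what this shows. All names are hypothetical: rescale the base LM's next-token probabilities by an attribute discriminator's score, renormalize, and sample.

```python
# Generic discriminator-guided decoding: sample from
# p_LM(t) * disc(context, t)^alpha, renormalized over the vocabulary.
import random

def guided_next_token(lm_probs, disc_score, context, alpha=1.0):
    """lm_probs: {token: prob}; disc_score(context, token) -> [0, 1]."""
    weighted = {t: p * disc_score(context, t) ** alpha
                for t, p in lm_probs.items()}
    z = sum(weighted.values())
    r, acc = random.random(), 0.0
    for t, w in weighted.items():
        acc += w / z
        if acc >= r:
            return t
    return t  # guard against floating-point underrun

# Toy usage: steer a three-token vocabulary toward positive sentiment.
lm = {"great": 0.3, "awful": 0.3, "okay": 0.4}
positive = lambda ctx, tok: {"great": 0.9, "awful": 0.1, "okay": 0.5}[tok]
print(guided_next_token(lm, positive, context="The movie was"))
```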