Showing 1 - 10 of 17 for search: '"Panda, Ashwinee"'
Author:
Panda, Ashwinee, Isik, Berivan, Qi, Xiangyu, Koyejo, Sanmi, Weissman, Tsachy, Mittal, Prateek
Existing methods for adapting large language models (LLMs) to new tasks are not suited to multi-task adaptation because they modify all the model weights -- causing destructive interference between tasks. The resulting effects, such as catastrophic forgetting, …
External link:
http://arxiv.org/abs/2406.16797
Author:
Qi, Xiangyu, Panda, Ashwinee, Lyu, Kaifeng, Ma, Xiao, Roy, Subhrajit, Beirami, Ahmad, Mittal, Prateek, Henderson, Peter
The safety alignment of current Large Language Models (LLMs) is vulnerable. Relatively simple attacks, or even benign fine-tuning, can jailbreak aligned models. We argue that many of these vulnerabilities are related to a shared underlying issue: safety alignment …
External link:
http://arxiv.org/abs/2406.05946
Author:
Panda, Ashwinee, Choquette-Choo, Christopher A., Zhang, Zhengming, Yang, Yaoqing, Mittal, Prateek
When large language models are trained on private data, it can be a significant privacy risk for them to memorize and regurgitate sensitive information. In this work, we propose a new practical data extraction attack that we call "neural phishing". …
External link:
http://arxiv.org/abs/2403.00871
Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner, but has proven difficult to scale to the era of foundation models. We introduce DP-ZO, a private fine-tuning framework for large language models …
External link:
http://arxiv.org/abs/2401.04343
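The entry above only names DP-ZO, so the sketch below is a hedged illustration of differentially private zeroth-order fine-tuning: the gradient is estimated from the loss difference along a random perturbation direction, and that single scalar is clipped and noised before it touches the update. The function names, hyperparameters, and toy objective are illustrative assumptions, not the paper's reference implementation.

```python
# Hedged sketch of a DP zeroth-order update: the only data-dependent quantity
# is a scalar loss difference, which is clipped and noised before use.
import numpy as np

def dp_zo_step(weights, loss_fn, lr=1e-2, eps=1e-3, clip=1.0, noise_multiplier=1.0):
    z = np.random.randn(*weights.shape)                        # random direction
    delta = loss_fn(weights + eps * z) - loss_fn(weights - eps * z)
    scalar = np.clip(delta / (2 * eps), -clip, clip)           # bound sensitivity
    scalar += np.random.normal(0.0, noise_multiplier * clip)   # Gaussian mechanism
    return weights - lr * scalar * z                           # step along the direction

# Toy usage: minimize a quadratic without ever computing its gradient.
w = np.ones(5)
for _ in range(200):
    w = dp_zo_step(w, lambda v: float(np.sum(v ** 2)))
```

Note that only loss values are needed, which is why zeroth-order approaches sidestep the per-sample gradient machinery that makes DP-SGD expensive at foundation-model scale.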
Author:
Qi, Xiangyu, Huang, Kaixuan, Panda, Ashwinee, Henderson, Peter, Wang, Mengdi, Mittal, Prateek
Recently, there has been a surge of interest in integrating vision into Large Language Models (LLMs), exemplified by Visual Language Models (VLMs) such as Flamingo and GPT-4. This paper sheds light on the security and safety implications of this trend. …
External link:
http://arxiv.org/abs/2306.13213
In privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) performs worse than SGD due to per-sample gradient clipping and noise addition. A recent focus in private learning research is improving the performance …
External link:
http://arxiv.org/abs/2306.06076
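The entry above attributes DP-SGD's accuracy gap to per-sample gradient clipping and noise addition; the minimal sketch below shows one such step. All names and hyperparameters are illustrative and not tied to any particular library.

```python
# Hedged sketch of one DP-SGD step: clip each example's gradient to a fixed norm,
# average the clipped gradients, then add Gaussian noise scaled to that norm.
import numpy as np

def dp_sgd_step(weights, per_sample_grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.0):
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]                      # per-sample clipping
    avg = np.mean(clipped, axis=0)
    noise = np.random.normal(0.0,
                             noise_multiplier * clip_norm / len(per_sample_grads),
                             size=avg.shape)                   # calibrated Gaussian noise
    return weights - lr * (avg + noise)

# Toy usage: three per-example gradients for a four-parameter model.
w = np.zeros(4)
w = dp_sgd_step(w, [np.random.randn(4) for _ in range(3)])
```

Both the clipping bias and the added noise distort the averaged gradient, which is the usual explanation for the utility gap relative to plain SGD.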
In-context learning (ICL) is an important capability of Large Language Models (LLMs), enabling these models to dynamically adapt based on specific, in-context exemplars, thereby improving accuracy and relevance. However, an LLM's responses may leak the …
External link:
http://arxiv.org/abs/2305.01639
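To make the leakage channel in the entry above concrete, the sketch below shows how in-context exemplars are pasted verbatim into the prompt, so anything sensitive in them can surface in the model's responses. The prompt template and example data are made up for illustration.

```python
# Hedged sketch: in-context exemplars become literal prompt text.
def build_icl_prompt(exemplars, query):
    shots = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in exemplars)
    return f"{shots}\nInput: {query}\nLabel:"

private_exemplars = [
    ("patient reports chest pain", "urgent"),
    ("routine follow-up visit", "non-urgent"),
]
print(build_icl_prompt(private_exemplars, "persistent cough for two weeks"))
```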
An open problem in differentially private deep learning is hyperparameter optimization (HPO). DP-SGD introduces new hyperparameters and complicates existing ones, forcing researchers to painstakingly tune hyperparameters with hundreds of trials, which …
External link:
http://arxiv.org/abs/2212.04486
Author:
Zhang, Zhengming, Panda, Ashwinee, Song, Linyue, Yang, Yaoqing, Mahoney, Michael W., Gonzalez, Joseph E., Ramchandran, Kannan, Mittal, Prateek
Due to their decentralized nature, federated learning (FL) systems are inherently vulnerable during training to adversarial backdoor attacks. In this type of attack, the goal of the attacker is to use poisoned updates to implant so-called backdoors …
External link:
http://arxiv.org/abs/2206.10341
Author:
Panda, Ashwinee, Mahloujifar, Saeed, Bhagoji, Arjun N., Chakraborty, Supriyo, Mittal, Prateek
Federated learning is inherently vulnerable to model poisoning attacks because its decentralized nature allows attackers to participate with compromised devices. In model poisoning attacks, the attacker reduces the model's performance on targeted sub-tasks …
External link:
http://arxiv.org/abs/2112.06274