Showing 1 - 8 of 8 for search: '"Nie, Yuzhou."'
Existing works have established multiple benchmarks to highlight the security risks associated with Code GenAI. These risks are primarily reflected in two areas: a model's potential to generate insecure code (insecure coding) and its utility in cyberattacks…
External link:
http://arxiv.org/abs/2410.11096
Modern large language model (LLM) developers typically conduct a safety alignment to prevent an LLM from generating unethical or harmful content. Recent studies have discovered that the safety alignment of LLMs can be bypassed by jailbreaking prompts…
External link:
http://arxiv.org/abs/2406.08725
Recent studies developed jailbreaking attacks, which construct jailbreaking prompts to fool LLMs into responding to harmful questions. Early-stage jailbreaking attacks require access to model internals or significant human effort. More advanced attacks…
External link:
http://arxiv.org/abs/2406.08705
Author:
Nie, Yuzhou; Wang, Yanting; Jia, Jinyuan; De Lucia, Michael J.; Bastian, Nathaniel D.; Guo, Wenbo; Song, Dawn
One key challenge in backdoor attacks against large foundation models is resource limits. Backdoor attacks usually require retraining the target model, which is impractical for very large foundation models. Existing backdoor attacks are mainly de…
External link:
http://arxiv.org/abs/2405.16783
Academic article
Author:
Wang, Wenwen, Zhang, Xiaochen, Gui, Ping, Zou, Qizhen, Nie, Yuzhou, Ma, Shenglin, Zhang, Shirong
Published in:
BMC Cancer; 9/5/2024, Vol. 24 Issue 1, p1-20, 20p
Published in:
Proceedings of SPIE; 5/24/2022, Vol. 12176, p1217613-1217613, 1p
Published in:
Proceedings of SPIE; May 2022, Vol. 12176 Issue: 1 p1217613-1217613-14