Search results - "Iranmanesh, Reihaneh"

Report

Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks

Authors: Gibbs, Tom; Kosak-Hine, Ethan; Ingebretsen, George; Zhang, Jason; Broomfield, Julius; Pieri, Sara; Iranmanesh, Reihaneh; Rabbany, Reihaneh; Pelrine, Kellin

Large language models (LLMs) are improving at an exceptional rate. However, these models are still susceptible to jailbreak attacks, which are becoming increasingly dangerous as models become increasingly powerful. In this work, we introduce a datase

External link: http://arxiv.org/abs/2409.00137
