Showing 1 - 8 of 8 for search: '"Glukhov, David"'
Vulnerability of Frontier language models to misuse and jailbreaks has prompted the development of safety measures like filters and alignment training in an effort to ensure safety through robustness to adversarially crafted prompts. We assert that r…
External link:
http://arxiv.org/abs/2407.02551
Large language models (LLMs) have exhibited impressive capabilities in comprehending complex instructions. However, their blind adherence to provided instructions has led to concerns regarding risks of malicious use. Existing defence mechanisms, such…
External link:
http://arxiv.org/abs/2307.10719
Author:
Wu, Jiapeng, Ghomi, Atiyeh Ashari, Glukhov, David, Cresswell, Jesse C., Boenisch, Franziska, Papernot, Nicolas
Machine learning models are susceptible to a variety of attacks that can erode trust, including attacks against the privacy of training data, and adversarial examples that jeopardize model accuracy. Differential privacy and certified robustness are e…
External link:
http://arxiv.org/abs/2306.08656
Published in:
In Nano-Structures & Nano-Objects, February 2023, Vol. 33
Author:
Sanjeev, Abhijit (abhijitsanjeevk@gmail.com), Glukhov, David, Salahudeen Rafeeka, Rinsa, Karsenty, Avi, Zalevsky, Zeev
Published in:
Scientific Reports. 11/18/2023, Vol. 13 Issue 1, p1-13. 13p.
Published in:
In Applied Surface Science, 30 June 2022, Vol. 588
Author:
Glukhov, David, Zalevsky, Zeev, Karsenty, Avi (karsenty@jct.ac.il)
Published in:
Scientific Reports. 1/28/2022, Vol. 12 Issue 1, p1-20. 20p.
Author:
Glukhov, David, Zalevsky, Zeev, Karsenty, Avi (karsenty@jct.ac.il)
Published in:
Scientific Reports. 2/22/2022, Vol. 12 Issue 1, p1-2. 2p.