Showing 1 - 2 of 2 for search: '"Bartoszcze, Łukasz"'
Authors:
Rosati, Domenic; Wehner, Jan; Williams, Kai; Bartoszcze, Łukasz; Atanasov, David; Gonzales, Robie; Majumdar, Subhabrata; Maple, Carsten; Sajjad, Hassan; Rudzicz, Frank
Releasing open-source large language models (LLMs) presents a dual-use risk since bad actors can easily fine-tune these models for harmful purposes. Even without the open release of weights, weight stealing and fine-tuning APIs make closed models vul
External link:
http://arxiv.org/abs/2405.14577
Authors:
Rosati, Domenic; Wehner, Jan; Williams, Kai; Bartoszcze, Łukasz; Batzner, Jan; Sajjad, Hassan; Rudzicz, Frank
Large Language Models (LLMs) are often trained with safety guards intended to prevent harmful text generation. However, such safety training can be removed by fine-tuning the LLM on harmful datasets. While this emerging threat (harmful fine-tuning at
External link:
http://arxiv.org/abs/2402.16382