Showing 1 - 3 of 3
for search: '"Lile, Nathan"'
Author:
Mahan, Dakota, Van Phung, Duy, Rafailov, Rafael, Blagden, Chase, Lile, Nathan, Castricato, Louis, Fränken, Jan-Philipp, Finn, Chelsea, Albalak, Alon
Reinforcement Learning from Human Feedback (RLHF) has greatly improved the performance of modern Large Language Models (LLMs). The RLHF process is resource-intensive and technically challenging, generally requiring a large collection of human preference…
External link:
http://arxiv.org/abs/2410.12832
The rapid advancement of language models (LMs) necessitates robust alignment with diverse user values. However, current preference optimization approaches often fail to capture the plurality of user opinions, instead reinforcing majority viewpoints…
External link:
http://arxiv.org/abs/2407.17387
Author:
Castricato, Louis, Lile, Nathan, Anand, Suraj, Schoelkopf, Hailey, Verma, Siddharth, Biderman, Stella
Existing methods for controlling language models, such as RLHF and Constitutional AI, involve determining which LLM behaviors are desirable and training them into a language model. However, in many cases, it is desirable for LLMs to be controllable…
External link:
http://arxiv.org/abs/2402.07896