Showing 1 - 1 of 1 for search: '"Winata, Gusti"'
Authors:
Gureja, Srishti; Miranda, Lester James V.; Islam, Shayekh Bin; Maheshwary, Rishabh; Sharma, Drishti; Winata, Gusti; Lambert, Nathan; Ruder, Sebastian; Hooker, Sara; Fadaee, Marzieh
Reward models (RMs) have driven the state-of-the-art performance of LLMs today by enabling the integration of human feedback into the language modeling process. However, RMs are primarily trained and evaluated in English, and their capabilities in mu
External link:
http://arxiv.org/abs/2410.15522