Search results - "Pokorny, Rai Michael"

Report

Authors: McAleese, Nat; Pokorny, Rai Michael; Uribe, Juan Felipe Ceron; Nitishinskaya, Evgenia; Trebacz, Maja; Leike, Jan

Reinforcement learning from human feedback (RLHF) is fundamentally limited by the capacity of humans to correctly evaluate model output. To improve human evaluation ability and overcome that limitation, this work trains "critic" models that help human

External link: http://arxiv.org/abs/2407.00215
