Authors:
Sikchi, Harshit; Agarwal, Siddhant; Jajoo, Pranaya; Parajuli, Samyak; Chuck, Caleb; Rudolph, Max; Stone, Peter; Zhang, Amy; Niekum, Scott
Rewards remain an uninterpretable way to specify tasks for Reinforcement Learning, as humans are often unable to predict the optimal behavior of any given reward function, leading to poor reward design and reward hacking. Language presents an appeali
External link:
http://arxiv.org/abs/2412.05718