Showing 1 - 2 of 2 for search: '"Subramani, Rohan"'
As AI systems become more intelligent and their behavior becomes more challenging to assess, they may learn to game the flaws of human feedback instead of genuinely striving to follow instructions; however, this risk can be mitigated by controlling h
External link:
http://arxiv.org/abs/2311.07723
Authors:
Subramani, Rohan; Williams, Marcus; Heitmann, Max; Holm, Halfdan; Griffin, Charlie; Skalse, Joar
Most algorithms in reinforcement learning (RL) require that the objective is formalised with a Markovian reward function. However, it is well-known that certain tasks cannot be expressed by means of an objective in the Markov rewards formalism, motiv
External link:
http://arxiv.org/abs/2310.11840