Search results - "Subramani, Rohan"

Report

Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains

Authors: Clymer, Joshua; Baker, Garrett; Subramani, Rohan; Wang, Sam

As AI systems become more intelligent and their behavior becomes more challenging to assess, they may learn to game the flaws of human feedback instead of genuinely striving to follow instructions; however, this risk can be mitigated by controlling h

External link: http://arxiv.org/abs/2311.07723

Report

On The Expressivity of Objective-Specification Formalisms in Reinforcement Learning

Authors: Subramani, Rohan; Williams, Marcus; Heitmann, Max; Holm, Halfdan; Griffin, Charlie; Skalse, Joar

Most algorithms in reinforcement learning (RL) require that the objective is formalised with a Markovian reward function. However, it is well-known that certain tasks cannot be expressed by means of an objective in the Markov rewards formalism, motiv

External link: http://arxiv.org/abs/2310.11840
