Showing 1 - 2 of 2 for search: '"Montesinos, Victoriano"'
Reinforcement learning (RL) requires either manually specifying a reward function, which is often infeasible, or learning a reward model from a large amount of human feedback, which is often very expensive. We study a more sample-efficient alternativ
External link:
http://arxiv.org/abs/2310.12921
Author:
Victor Herrero Mediavilla
This cumulated Index encompasses the four Biographical Archives for Spain, Portugal and the Iberoamerican countries with 770,000 biographical entries.