Showing 1 - 4 of 4 for search: '"Singhal, Shivam"'
Because it is difficult to precisely specify complex objectives, reinforcement learning policies are often optimized using flawed proxy rewards that seem to capture the true objective. However, optimizing proxy rewards frequently leads to reward hack
External link:
http://arxiv.org/abs/2403.03185
Published in:
2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 1-8). IEEE
For robots to operate in a three dimensional world and interact with humans, learning spatial relationships among objects in the surrounding is necessary. Reasoning about the state of the world requires inputs from many different sensory modalities i
External link:
http://arxiv.org/abs/2108.01254
Published in:
Global Advances in Integrative Medicine & Health; 10/23/2023, p1-13, 13p
Published in:
Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference [Annu Int Conf IEEE Eng Med Biol Soc] 2019 Jul; Vol. 2019, pp. 3290-3296.