Showing 1 - 4 of 4 for search: '"Bharadwaj, Sudarshanan"'
In continuing tasks, average-reward reinforcement learning may be a more appropriate problem formulation than the more common discounted reward formulation. As usual, learning an optimal policy in this setting typically requires a large amount of tra
External link:
http://arxiv.org/abs/2007.01498
Published in:
Proceedings of the AAAI Conference on Artificial Intelligence. 35:7995-8003
As autonomous systems become more widely used in society, they will necessarily have to make more decisions in order to meet increasingly complex objectives. However, to facilitate greater deployment of autonomous systems, especially in safety-critic
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::72d373acefa06cd52e635609a360ddbc