Showing 1 - 1 of 1 for search: '"Entezami, Erfan"'
Training a model-free reinforcement learning agent requires allowing the agent to sufficiently explore the environment to search for an optimal policy. In safety-constrained environments, utilizing unsupervised exploration or a non-optimal policy may
External link:
http://arxiv.org/abs/2408.00997