Authors:
Oren, Yaniv; Zanger, Moritz A.; van der Vaart, Pascal R.; Spaan, Matthijs T. J.; Böhmer, Wendelin
Many modern reinforcement learning algorithms build on the actor-critic (AC) framework: iterative improvement of a policy (the actor) using policy improvement operators and iterative approximation of the policy's value (the critic). In contrast, the
External link:
http://arxiv.org/abs/2406.01423