Search results - "van der Vaart, Pascal R."

Report

Authors: Oren, Yaniv; Zanger, Moritz A.; van der Vaart, Pascal R.; Spaan, Matthijs T. J.; Bohmer, Wendelin

Many modern reinforcement learning algorithms build on the actor-critic (AC) framework: iterative improvement of a policy (the actor) using policy improvement operators and iterative approximation of the policy's value (the critic). In contrast, the

External link: http://arxiv.org/abs/2406.01423
