Authors:
Protopapas, Kimon; Barakat, Anas
Policy Mirror Descent (PMD) stands as a versatile algorithmic framework encompassing several seminal policy gradient algorithms such as natural policy gradient, with connections with state-of-the-art reinforcement learning (RL) algorithms such as TRP
External link:
http://arxiv.org/abs/2403.14156