Showing 1 - 10 of 38 for search: '"Chung, Wesley"'
Loss of plasticity, trainability loss, and primacy bias have been identified as issues arising when training deep neural networks on sequences of tasks -- all referring to the increased difficulty in training on new tasks. We propose to use Parseval
External link:
http://arxiv.org/abs/2412.07224
We study the effect of baselines in on-policy stochastic policy gradient optimization, and close the gap between the theory and practice of policy optimization methods. Our first contribution is to show that the \emph{state value} baseline allows on-
External link:
http://arxiv.org/abs/2301.06276
Bandit and reinforcement learning (RL) problems can often be framed as optimization problems where the goal is to maximize average performance while having access only to stochastic estimates of the true gradient. Traditionally, stochastic optimizati
External link:
http://arxiv.org/abs/2008.13773
Temporal difference methods enable efficient estimation of value functions in reinforcement learning in an incremental fashion, and are of broader interest because they correspond to learning as observed in biological systems. Standard value functions c
External link:
http://arxiv.org/abs/1907.04651
Importance sampling (IS) is a common reweighting strategy for off-policy prediction in reinforcement learning. While it is consistent and unbiased, it can result in high variance updates to the weights for the value function. In this work, we explore
External link:
http://arxiv.org/abs/1906.04328
Estimating the value function for a fixed policy is a fundamental problem in reinforcement learning. Policy evaluation algorithms---to estimate value functions---continue to be developed, to improve convergence rates, improve stability and handle var
External link:
http://arxiv.org/abs/1808.09127
Author:
Krupp, Anna; Martino, Michael Di; Chung, Wesley; Krisda Chaiyachati; Anish K. Agarwal; Huffenberger, Ann Marie; Laudanski, Krzysztof
Additional file 1. “Tele-medicine – BMC Semi-structured Interview” – Provider and Patient Perspective on the Value of Direct-to-consumer Telehealth for Urgent Care: Telemedicine Provider Semi-structured Interview Guide – V4.
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::232b25e81a44cdb4947ec20c8a1b4d0b