Showing 1 - 6 of 6 for search: '"Oertell, Owen"'
Navigating the vast chemical space of druggable compounds is a formidable challenge in drug discovery, where generative models are increasingly employed to identify viable candidates. Conditional 3D structure-based drug design (3D-SBDD) models, which…
External link:
http://arxiv.org/abs/2410.20660
Author:
Gao, Zhaolin, Chang, Jonathan D., Zhan, Wenhao, Oertell, Owen, Swamy, Gokul, Brantley, Kianté, Joachims, Thorsten, Bagnell, J. Andrew, Lee, Jason D., Sun, Wen
While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the workhorse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models. Unfortunately, PPO…
External link:
http://arxiv.org/abs/2404.16767
Author:
Chang, Jonathan D., Zhan, Wenhao, Oertell, Owen, Brantley, Kianté, Misra, Dipendra, Lee, Jason D., Sun, Wen
Reinforcement Learning (RL) from Human Preference-based feedback is a popular paradigm for fine-tuning generative models, which has produced impressive models such as GPT-4 and Claude 3 Opus. This framework often consists of two steps: learning a rewa…
External link:
http://arxiv.org/abs/2404.08495
Reinforcement learning (RL) has improved guided image generation with diffusion models by directly optimizing rewards that capture image quality, aesthetics, and instruction following capabilities. However, the resulting generative policies inherit t…
External link:
http://arxiv.org/abs/2404.03673
In this paper, we prove that Distributional Reinforcement Learning (DistRL), which learns the return distribution, can obtain second-order bounds in both online and offline RL in general settings with function approximation. Second-order bounds are i…
External link:
http://arxiv.org/abs/2402.07198