Showing 1 - 10 of 10 for search: '"Ni, Chengzhuo"'
Generative AI has redefined artificial intelligence, enabling the creation of innovative content and customized solutions that drive business practices into a new era of efficiency and creativity. In this paper, we focus on diffusion models, a powerf
External link:
http://arxiv.org/abs/2403.13219
We explore the methodology and theory of reward-directed generation via conditional diffusion models. Directed generation aims to generate samples with desired properties as measured by a reward function, which has broad applications in generative AI
External link:
http://arxiv.org/abs/2307.07055
We study multi-agent general-sum Markov games with nonlinear function approximation. We focus on low-rank Markov games whose transition matrix admits a hidden low-rank structure on top of an unknown non-linear representation. The goal is to design an
External link:
http://arxiv.org/abs/2210.16976
Authors:
Yuan, Hui; Ni, Chengzhuo; Wang, Huazheng; Zhang, Xuezhou; Cong, Le; Szepesvári, Csaba; Wang, Mengdi
Directed Evolution (DE), a landmark wet-lab method that originated in the 1960s, enables discovery of novel protein designs via evolving a population of candidate sequences. Recent advances in biotechnology have made it possible to collect high-throughput data
External link:
http://arxiv.org/abs/2206.02092
Off-Policy Evaluation (OPE) serves as one of the cornerstones in Reinforcement Learning (RL). Fitted Q Evaluation (FQE) with various function approximators, especially deep neural networks, has gained practical success. While statistical analysis has
External link:
http://arxiv.org/abs/2202.04970
Policy gradient (PG) estimation becomes a challenge when we are not allowed to sample with the target policy but only have access to a dataset generated by some unknown behavior policy. Conventional methods for off-policy PG estimation often suffer f
External link:
http://arxiv.org/abs/2202.00076
The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representat
External link:
http://arxiv.org/abs/2105.01136
Policy gradient (PG) gives rise to a rich class of reinforcement learning (RL) methods. Recently, there has been an emerging trend to accelerate existing PG methods such as REINFORCE using variance reduction techniques. However, all exist
External link:
http://arxiv.org/abs/2102.08607
We study online reinforcement learning for finite-horizon deterministic control systems with arbitrary state and action spaces. Suppose that the transition dynamics and reward function are unknown, but the state and action spaces are endowed with
External link:
http://arxiv.org/abs/1905.01576