Showing 1 - 10 of 50 for search: '"Jiang, Daniel R."'
Author:
Zhan, Wenhao, Fujimoto, Scott, Zhu, Zheqing, Lee, Jason D., Jiang, Daniel R., Efroni, Yonathan
We study the problem of learning an approximate equilibrium in the offline multi-agent reinforcement learning (MARL) setting. We introduce a structural assumption -- the interaction rank -- and establish that functions with low interaction rank are s…
External link:
http://arxiv.org/abs/2410.01101
Adaptive experimentation can significantly improve statistical power, but standard algorithms overlook important practical issues including batched and delayed feedback, personalization, non-stationarity, multiple objectives, and constraints. To addr…
External link:
http://arxiv.org/abs/2408.04570
Innovations across science and industry are evaluated using randomized trials (a.k.a. A/B tests). While simple and robust, such static designs are inefficient or infeasible for testing many hypotheses. Adaptive designs can greatly improve statistical…
External link:
http://arxiv.org/abs/2408.04531
Author:
Shar, Ibrahim El, Jiang, Daniel R.
We propose weakly coupled deep Q-networks (WCDQN), a novel deep reinforcement learning algorithm that enhances performance in a class of structured problems called weakly coupled Markov decision processes (WCMDP). WCMDPs consist of multiple independe…
External link:
http://arxiv.org/abs/2310.18803
Author:
Wang, Yijia, Jiang, Daniel R.
We consider infinite horizon Markov decision processes (MDPs) with fast-slow structure, meaning that certain parts of the state space move "fast" (and in a sense, are more influential) while other parts transition more "slowly." Such structure is com…
External link:
http://arxiv.org/abs/2301.00922
Bayesian optimization (BO) is a sample-efficient approach to optimizing costly-to-evaluate black-box functions. Most BO methods ignore how evaluation costs may vary over the optimization domain. However, these costs can be highly heterogeneous and ar…
External link:
http://arxiv.org/abs/2111.06537
Author:
Jiang, Shali, Jiang, Daniel R., Balandat, Maximilian, Karrer, Brian, Gardner, Jacob R., Garnett, Roman
Bayesian optimization is a sequential decision making framework for optimizing expensive-to-evaluate black-box functions. Computing a full lookahead policy amounts to solving a highly intractable stochastic dynamic program. Myopic approaches, such as…
External link:
http://arxiv.org/abs/2006.15779
Author:
Shar, Ibrahim El, Jiang, Daniel R.
We introduce the lookahead-bounded Q-learning (LBQL) algorithm, a new, provably convergent variant of Q-learning that seeks to improve the performance of standard Q-learning in stochastic environments through the use of "lookahead" upper and lower…
External link:
http://arxiv.org/abs/2006.15690
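For orientation, LBQL is described above as a variant of standard tabular Q-learning. The sketch below shows only the baseline Q-learning update it builds on, not the paper's lookahead bounding mechanism; the two-state environment and hyperparameters are hypothetical, chosen purely for illustration.

```python
import random

# Minimal tabular Q-learning on a toy 2-state, 2-action MDP (illustrative
# environment, NOT from the paper). Action 1 in state 0 moves to state 1
# and pays reward 1; every other transition returns to state 0 with reward 0.
ALPHA, GAMMA, STEPS = 0.1, 0.9, 2000

def step(state, action):
    if state == 0 and action == 1:
        return 1, 1.0
    return 0, 0.0

random.seed(0)
Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
state = 0
for _ in range(STEPS):
    action = random.choice((0, 1))  # pure exploration, for simplicity
    nxt, reward = step(state, action)
    # Standard Q-learning temporal-difference update.
    target = reward + GAMMA * max(Q[(nxt, 0)], Q[(nxt, 1)])
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
    state = nxt

# The rewarded action should dominate in state 0.
assert Q[(0, 1)] > Q[(0, 0)]
```

LBQL's contribution, per the abstract, is to constrain such updates with lookahead-derived upper and lower bounds; that mechanism is not reproduced here.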
Published in:
Transactions on Machine Learning Research (09/2023)
Reinforcement learning in sparse-reward navigation environments with expensive and limited interactions is challenging and poses a need for effective exploration. Motivated by complex navigation tasks that require real-world training (when cheap simu…
External link:
http://arxiv.org/abs/1910.09143
Author:
Balandat, Maximilian, Karrer, Brian, Jiang, Daniel R., Daulton, Samuel, Letham, Benjamin, Wilson, Andrew Gordon, Bakshy, Eytan
Published in:
Advances in Neural Information Processing Systems 33, 2020
Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, engineering, physics, and experimental design. We introduce BoTorch, a modern programming framework for Bayes…
External link:
http://arxiv.org/abs/1910.06403