Showing 1 - 6 of 6 for search: '"Neopane, Ojash"'
Estimation of the Average Treatment Effect (ATE) is a core problem in causal inference with strong connections to Off-Policy Evaluation in Reinforcement Learning. This paper considers the problem of adaptively selecting the treatment allocation proba
External link:
http://arxiv.org/abs/2411.14341
Authors:
Mehta, Viraj; Das, Vikramjeet; Neopane, Ojash; Dai, Yijia; Bogunovic, Ilija; Schneider, Jeff; Neiswanger, Willie
Preference-based feedback is important for many applications in reinforcement learning where direct evaluation of a reward function is not feasible. A notable recent example arises in reinforcement learning from human feedback (RLHF) on large languag
External link:
http://arxiv.org/abs/2312.00267
Preference-based feedback is important for many applications where direct evaluation of a reward function is not feasible. A notable recent example arises in reinforcement learning from human feedback on large language models. For many of these appli
External link:
http://arxiv.org/abs/2307.11288
We consider a variant of the best arm identification (BAI) problem in multi-armed bandits (MAB) in which there are two sets of arms (source and target), and the objective is to determine the best target arm while only pulling source arms. In this pap
External link:
http://arxiv.org/abs/2112.04083
In recent years deep learning algorithms have shown extremely high performance on machine learning tasks such as image classification and speech recognition. In support of such applications, various FPGA accelerator architectures have been proposed f
External link:
http://arxiv.org/abs/1705.02583
Restricted Boltzmann Machines and Deep Belief Networks have been successfully used in probabilistic generative model applications such as image occlusion removal, pattern completion and motion synthesis. Generative inference in such algorithms can be
External link:
http://arxiv.org/abs/1602.05996