Showing 1 - 10 of 2,053
for search: '"Bennett, Andrew A."'
We study evaluating a policy under best- and worst-case perturbations to a Markov decision process (MDP), given transition observations from the original MDP, whether under the same or different policy. This is an important problem when there is the …
External link:
http://arxiv.org/abs/2404.00099
Published in:
PMLR, Volume 238, 2024
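The evaluation problem in the abstract above can be illustrated with a toy tabular example (this is not the paper's method; the MDP and its perturbation below are invented for illustration): the value of a fixed policy in an MDP, recomputed after perturbing the transition kernel.

```python
import numpy as np

# Toy tabular setup: 2 states, with a fixed policy already folded
# into the transition matrix P and the reward vector r.
gamma = 0.9
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])   # transitions under the policy
r = np.array([1.0, 0.0])     # expected reward per state under the policy

def policy_value(P, r, gamma):
    # Solve the Bellman equation v = r + gamma * P v,
    # i.e. (I - gamma * P) v = r.
    return np.linalg.solve(np.eye(len(r)) - gamma * P, r)

v = policy_value(P, r, gamma)

# A worst-case-style perturbation: shift probability mass
# toward the zero-reward state.
P_bad = np.array([[0.7, 0.3],
                  [0.2, 0.8]])
v_bad = policy_value(P_bad, r, gamma)
```

The gap between `v` and `v_bad` is the kind of quantity a robust off-policy evaluation method must bound from transition data alone.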
Low-Rank Markov Decision Processes (MDPs) have recently emerged as a promising framework within the domain of reinforcement learning (RL), as they allow for provably approximately correct (PAC) learning guarantees while also incorporating ML algorithms …
External link:
http://arxiv.org/abs/2311.03564
Author:
Bruno, Barbara C., Cackowski, Celia, Frederick, John Adam, Vincent, Robert, Bennett, Andrew, Böttjer-Wilson, Daniela, Engels, Jennifer, Flight, Chris, Lang, Amy, Lawrence, Lisa Ayers, Smith, Bethany, Takacs, Jacqueline
Published in:
Oceanography, 2024 Mar 01. 37(1), 54-59.
External link:
https://www.jstor.org/stable/27301084
Author:
Bennett, Andrew, Kallus, Nathan, Mao, Xiaojie, Newey, Whitney, Syrgkanis, Vasilis, Uehara, Masatoshi
We consider estimation of parameters defined as linear functionals of solutions to linear inverse problems. Any such parameter admits a doubly robust representation that depends on the solution to a dual linear inverse problem, where the dual solution …
External link:
http://arxiv.org/abs/2307.13793
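As a point of reference for the doubly robust representation mentioned above, here is the classical AIPW estimator of a mean outcome under missing-at-random data, with the true nuisance functions plugged in for illustration (this is the textbook special case, not the paper's estimator; all data below are simulated):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=n)              # covariate
pi = 1 / (1 + np.exp(-x))           # propensity of observing the outcome
a = rng.binomial(1, pi)             # observation indicator
y = 2 * x + rng.normal(size=n)      # outcome (only observed where a == 1)

mu = 2 * x                          # outcome regression E[Y | X]
# Doubly robust (AIPW) estimate of E[Y]: consistent if either
# the outcome model mu or the propensity pi is correctly specified.
dr = np.mean(mu + a * (y - mu) / pi)

# With a wrong outcome model (mu = 0) it reduces to pure IPW and
# still targets E[Y] = 0 because the propensity is correct.
dr_wrong_mu = np.mean(a * y / pi)
```

Here the "dual" object is the inverse propensity `1 / pi`; the paper's setting generalizes this pairing to arbitrary linear inverse problems.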
Author:
Bennett, Andrew, Kallus, Nathan, Mao, Xiaojie, Newey, Whitney, Syrgkanis, Vasilis, Uehara, Masatoshi
In this paper, we study nonparametric estimation of instrumental variable (IV) regressions. Recently, many flexible machine learning methods have been developed for instrumental variable estimation. However, these methods have at least one of the following …
External link:
http://arxiv.org/abs/2302.05404
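For context, the simplest instance of the IV regression problem described above is linear two-stage least squares; the paper concerns flexible nonparametric versions, but the confounding structure is the same (simulated data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.normal(size=n)                 # instrument
u = rng.normal(size=n)                 # unobserved confounder
x = z + u + rng.normal(size=n)         # endogenous regressor
y = 1.5 * x + u + rng.normal(size=n)   # true causal effect of x on y is 1.5

# Naive OLS is biased upward because u drives both x and y.
beta_ols = np.sum(x * y) / np.sum(x * x)

# 2SLS: project x onto the instrument z, then regress y
# on the fitted values, which are purged of the confounding.
x_hat = z * (np.sum(z * x) / np.sum(z * z))
beta_2sls = np.sum(x_hat * y) / np.sum(x_hat * x_hat)
```

Replacing the two linear projections with learned function classes is exactly where the stability and convergence issues listed in the abstract arise.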
Author:
Sanni, Akeem, Bennett, Andrew I., Huang, Yifan, Gidi, Isabella, Adeniyi, Moyinoluwa, Nwaiwu, Judith, Kang, Min H., Keyel, Michelle E., Gao, ChongFeng, Reynolds, C. Patrick, Haab, Brian, Mechref, Yehia
Published in:
Cells (2073-4409). Oct2024, Vol. 13 Issue 19, p1640. 16p.
Safety is a crucial necessity in many applications of reinforcement learning (RL), whether robotic, automotive, or medical. Many existing approaches to safe RL rely on receiving numeric safety feedback, but in many cases this feedback can only take binary values …
External link:
http://arxiv.org/abs/2210.14492
Author:
Bennett, Andrew, Kallus, Nathan, Mao, Xiaojie, Newey, Whitney, Syrgkanis, Vasilis, Uehara, Masatoshi
In a variety of applications, including nonparametric instrumental variable (NPIV) analysis, proximal causal inference under unmeasured confounding, and missing-not-at-random data with shadow variables, we are interested in inference on a continuous …
External link:
http://arxiv.org/abs/2208.08291
Author:
Uehara, Masatoshi, Kiyohara, Haruka, Bennett, Andrew, Chernozhukov, Victor, Jiang, Nan, Kallus, Nathan, Shi, Chengchun, Sun, Wen
We study off-policy evaluation (OPE) for partially observable MDPs (POMDPs) with general function approximation. Existing methods such as sequential importance sampling estimators and fitted-Q evaluation suffer from the curse of horizon in POMDPs. To …
External link:
http://arxiv.org/abs/2207.13081
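The "curse of horizon" the abstract refers to can be seen directly in the cumulative weights of sequential importance sampling: even when each per-step density ratio has mean one, the variance of their product grows exponentially in the horizon (the ratio values below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def is_weight_variance(horizon, n=200_000):
    # Per-step ratio pi_target(a|s) / pi_behavior(a|s): mean 1, but
    # the product over the horizon has variance (E[ratio^2])^H - 1,
    # which grows exponentially in H.
    ratios = rng.choice([1.2, 0.8], size=(n, horizon))
    weights = ratios.prod(axis=1)   # cumulative IS weight per trajectory
    return weights.var()

var_short = is_weight_variance(5)    # analytically 1.04**5  - 1 ≈ 0.22
var_long = is_weight_variance(50)    # analytically 1.04**50 - 1 ≈ 6.1
```

Estimators that avoid multiplying per-step ratios over the full horizon are precisely what "overcoming the curse of horizon" means in this literature.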
Author:
Bennett, Andrew, Braumoeller, Bear F.
This paper analyzes the working or default assumptions researchers in the formal, statistical, and case study traditions typically hold regarding the sources of unexplained variance, the meaning of outliers, parameter values, human motivation, functional forms …
External link:
http://arxiv.org/abs/2202.08062