Showing 1 - 10 of 67 for search: '"Gottesman, Omer"'
Author:
Allen, Cameron, Kirtland, Aaron, Tao, Ruo Yu, Lobel, Sam, Scott, Daniel, Petrocelli, Nicholas, Gottesman, Omer, Parr, Ronald, Littman, Michael L., Konidaris, George
Reinforcement learning algorithms typically rely on the assumption that the environment dynamics and value function can be expressed in terms of a Markovian state representation. However, when state information is only partially observable, how can a…
External link:
http://arxiv.org/abs/2407.07333
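A textbook fact that frames the question raised in this snippet: in a partially observable MDP, the raw observations are generally not Markov, but the belief state

\[
b_t(s) \;=\; \Pr\big(s_t = s \mid o_1, a_1, \ldots, a_{t-1}, o_t\big)
\]

is, so a Markovian representation always exists in principle; the practical difficulty is learning or detecting one from data. This is standard POMDP background, not a claim about the linked paper's method.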
We study the convergence behavior of the celebrated temporal-difference (TD) learning algorithm. By looking at the algorithm through the lens of optimization, we first argue that TD can be viewed as an iterative optimization algorithm where the…
External link:
http://arxiv.org/abs/2306.17750
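As context for the snippet above, a minimal tabular TD(0) sketch in Python. The Gym-style environment interface (env.reset, env.step) is a hypothetical stand-in for illustration, not code from the paper:

import numpy as np

def td0_evaluation(env, policy, n_states, alpha=0.1, gamma=0.99, episodes=500):
    # Tabular TD(0) policy evaluation: V(s) += alpha * (target - V(s)).
    V = np.zeros(n_states)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            s_next, r, done = env.step(policy(s))
            target = r + (0.0 if done else gamma * V[s_next])
            V[s] += alpha * (target - V[s])  # move V(s) toward the bootstrapped target
            s = s_next
    return V

Note that each update chases a bootstrapped target that depends on the current estimate, which is one way to read the abstract's optimization framing.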
Decision-focused (DF) model-based reinforcement learning has recently been introduced as a powerful algorithm that can focus on learning the MDP dynamics that are most relevant for obtaining high returns. While this approach increases the agent's…
External link:
http://arxiv.org/abs/2304.03365
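To illustrate only the general idea of return-relevant model learning, a hedged Python sketch contrasting a maximum-likelihood dynamics loss with a value-weighted one. The function names and the specific value-weighted loss are illustrative assumptions, not the paper's algorithm:

import numpy as np

def mle_model_loss(pred_next, true_next):
    # Conventional model learning: fit the dynamics equally well everywhere.
    return np.mean((pred_next - true_next) ** 2)

def decision_focused_loss(pred_next, true_next, value_fn):
    # Hypothetical decision-focused variant: penalize dynamics errors by how
    # much they distort predicted values, so model capacity concentrates on
    # transitions that matter for returns.
    return np.mean((value_fn(pred_next) - value_fn(true_next)) ** 2)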
Advances in reinforcement learning have led to its successful application in complex tasks with continuous state and action spaces. Despite these advances in practice, most theoretical work pertains to finite state and action spaces. We propose…
External link:
http://arxiv.org/abs/2301.00009
In the reinforcement learning literature, there are many algorithms developed for either Contextual Bandit (CB) or Markov Decision Process (MDP) environments. However, when deploying reinforcement learning algorithms in the real world, even with…
External link:
http://arxiv.org/abs/2208.00250
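One way to see the relationship between the two settings this snippet contrasts: a contextual bandit is effectively a one-step MDP, so its action values reduce to immediate expected rewards, while MDP values bootstrap:

\[
Q_{\text{CB}}(s, a) = \mathbb{E}[\,r \mid s, a\,], \qquad
Q_{\text{MDP}}(s, a) = \mathbb{E}\big[\,r + \gamma \max_{a'} Q_{\text{MDP}}(s', a') \mid s, a\,\big].
\]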
Author:
Asadi, Kavosh, Fakoor, Rasool, Gottesman, Omer, Kim, Taesup, Littman, Michael L., Smola, Alexander J.
Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against…
External link:
http://arxiv.org/abs/2112.05848
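As a minimal sketch of the online/target pattern described above, in PyTorch; the Polyak coefficient is an illustrative default, not a value from the paper:

import copy
import torch

online = torch.nn.Linear(4, 2)   # stand-in for the online Q-network
target = copy.deepcopy(online)   # target network starts as an exact copy

def polyak_update(target_net, online_net, tau=0.005):
    # Target slowly tracks the online network:
    # theta_target <- tau * theta_online + (1 - tau) * theta_target
    with torch.no_grad():
        for p_t, p_o in zip(target_net.parameters(), online_net.parameters()):
            p_t.mul_(1.0 - tau).add_(tau * p_o)

# TD targets are then computed with the lagged target network, damping the
# feedback loop of bootstrapping a network against its own latest weights:
#   y = r + gamma * target(s_next).max(dim=-1).values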
Off-policy policy evaluation methods for sequential decision making can be used to help identify if a proposed decision policy is better than a current baseline policy. However, a new decision policy may be better than a baseline policy for some…
External link:
http://arxiv.org/abs/2111.14272
Author:
Gottesman, Omer, Asadi, Kavosh, Allen, Cameron, Lobel, Sam, Konidaris, George, Littman, Michael
Principled decision-making in continuous state-action spaces is impossible without some assumptions. A common approach is to assume Lipschitz continuity of the Q-function. We show that, unfortunately, this property fails to hold in many typical…
External link:
http://arxiv.org/abs/2110.12276
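For reference, the Lipschitz assumption this snippet questions says that Q cannot change faster than the metric on the joint state-action space allows:

\[
|Q(s, a) - Q(s', a')| \;\le\; L \, d\big((s, a), (s', a')\big)
\quad \text{for all } (s, a), (s', a'),
\]

for some constant L and metric d; the abstract's point is that even common domains violate this.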
Published in:
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9537-9546, 2021
Importance sampling-based estimators for off-policy evaluation (OPE) are valued for their simplicity, unbiasedness, and reliance on relatively few assumptions. However, the variance of these estimators is often high, especially when trajectories are…
External link:
http://arxiv.org/abs/2109.06310
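As context, a minimal sketch of the ordinary per-trajectory importance-sampling estimator this snippet refers to; the trajectory data layout is an assumption for illustration:

import numpy as np

def is_ope(trajectories, pi_e, pi_b, gamma=0.99):
    # Each trajectory is a list of (s, a, r) tuples collected under the
    # behavior policy pi_b; pi_e(a, s) and pi_b(a, s) return action probabilities.
    estimates = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            weight *= pi_e(a, s) / pi_b(a, s)  # cumulative likelihood ratio
            ret += (gamma ** t) * r
        estimates.append(weight * ret)  # unbiased, but variance can blow up
    return float(np.mean(estimates))

Because the weight is a product of per-step ratios, its variance grows with trajectory length, which is the high-variance regime the abstract highlights.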
A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state…
External link:
http://arxiv.org/abs/2106.04379
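One standard way to formalize the requirement this snippet raises: an abstraction \phi over observations is Markov when conditioning on the abstract history adds nothing beyond the current abstract state and action,

\[
\Pr\big(\phi(s_{t+1}) \mid \phi(s_t), a_t\big)
= \Pr\big(\phi(s_{t+1}) \mid \phi(s_t), a_t, \phi(s_{t-1}), a_{t-1}, \ldots\big).
\]

This is a generic formalization, not necessarily the exact definition used in the linked paper.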