Showing 1 - 10 of 44 for search: '"Parr, Ronald"'
Author:
Allen, Cameron, Kirtland, Aaron, Tao, Ruo Yu, Lobel, Sam, Scott, Daniel, Petrocelli, Nicholas, Gottesman, Omer, Parr, Ronald, Littman, Michael L., Konidaris, George
Reinforcement learning algorithms typically rely on the assumption that the environment dynamics and value function can be expressed in terms of a Markovian state representation. However, when state information is only partially observable, how can a…
External link:
http://arxiv.org/abs/2407.07333
Author:
Rudin, Cynthia, Zhong, Chudi, Semenova, Lesia, Seltzer, Margo, Parr, Ronald, Liu, Jiachang, Katta, Srikar, Donnelly, Jon, Chen, Harry, Boner, Zachery
Published in:
ICML (spotlight), 2024
The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, b…
External link:
http://arxiv.org/abs/2407.04846
Author:
Lobel, Sam, Parr, Ronald
We present a bound for value-prediction error with respect to model misspecification that is tight, including constant factors. This is a direct improvement of the "simulation lemma," a foundational result in reinforcement learning. We demonstrate th…
External link:
http://arxiv.org/abs/2406.16249
The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabul…
External link:
http://arxiv.org/abs/2310.19726
We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value f…
External link:
http://arxiv.org/abs/2006.05595
A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep ra…
External link:
http://arxiv.org/abs/2002.01883
Published in:
2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT'22)
It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal, sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is gene…
External link:
http://arxiv.org/abs/1908.01755
The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator. Surprising…
External link:
http://arxiv.org/abs/1812.00456
Author:
Parr, Ronald, van der Gaag, Linda S.
This is the Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, which was held in Vancouver, British Columbia, July 19-22, 2007.
External link:
http://arxiv.org/abs/1208.5155
Feature selection and regularization are becoming increasingly prominent tools in the efforts of the reinforcement learning (RL) community to expand the reach and applicability of RL. One approach to the problem of feature selection is to impose a sp…
External link:
http://arxiv.org/abs/1206.6485