Showing 1 - 10 of 44 for search: '"Parr, Ronald"'
Author:
Allen, Cameron, Kirtland, Aaron, Tao, Ruo Yu, Lobel, Sam, Scott, Daniel, Petrocelli, Nicholas, Gottesman, Omer, Parr, Ronald, Littman, Michael L., Konidaris, George
Reinforcement learning algorithms typically rely on the assumption that the environment dynamics and value function can be expressed in terms of a Markovian state representation. However, when state information is only partially observable, how can a…
External link:
http://arxiv.org/abs/2407.07333
Author:
Rudin, Cynthia, Zhong, Chudi, Semenova, Lesia, Seltzer, Margo, Parr, Ronald, Liu, Jiachang, Katta, Srikar, Donnelly, Jon, Chen, Harry, Boner, Zachery
Published in:
ICML (spotlight), 2024
The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, b…
External link:
http://arxiv.org/abs/2407.04846
Author:
Lobel, Sam, Parr, Ronald
We present a bound for value-prediction error with respect to model misspecification that is tight, including constant factors. This is a direct improvement of the "simulation lemma," a foundational result in reinforcement learning. We demonstrate th…
External link:
http://arxiv.org/abs/2406.16249
The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabul…
External link:
http://arxiv.org/abs/2310.19726
We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value f…
External link:
http://arxiv.org/abs/2006.05595
A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep ra…
External link:
http://arxiv.org/abs/2002.01883
Published in:
2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT'22)
It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal, sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is gene…
External link:
http://arxiv.org/abs/1908.01755
The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator. Surprising…
External link:
http://arxiv.org/abs/1812.00456
Author:
Parr, Ronald, van der Gaag, Linda S.
This is the Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, which was held in Vancouver, British Columbia, July 19-22, 2007.
External link:
http://arxiv.org/abs/1208.5155
Feature selection and regularization are becoming increasingly prominent tools in the efforts of the reinforcement learning (RL) community to expand the reach and applicability of RL. One approach to the problem of feature selection is to impose a sp…
External link:
http://arxiv.org/abs/1206.6485