Showing 1 - 10 of 62 for the search: '"Modayil, Joseph"'
Modern machine learning systems have demonstrated substantial abilities with methods that either embrace or ignore human-provided knowledge, but combining the benefits of both styles remains a challenge. One particular challenge involves designing learning…
External link:
http://arxiv.org/abs/2408.04242
Author:
Modayil, Joseph, Abbas, Zaheer
Conventional reinforcement learning (RL) algorithms exhibit broad generality in their theoretical formulation and high performance on several challenging domains when combined with powerful function approximation. However, developing RL algorithms that…
External link:
http://arxiv.org/abs/2311.02215
The ability to learn continually is essential in a complex and changing world. In this paper, we characterize the behavior of canonical value-based deep reinforcement learning (RL) approaches under varying degrees of non-stationarity. In particular,…
External link:
http://arxiv.org/abs/2303.07507
Author:
Pilarski, Patrick M., Butcher, Andrew, Davoodi, Elnaz, Johanson, Michael Bradley, Brenneis, Dylan J. A., Parker, Adam S. R., Acker, Leslie, Botvinick, Matthew M., Modayil, Joseph, White, Adam
Learned communication between agents is a powerful tool when approaching decision-making problems that are hard to overcome by any single agent in isolation. However, continual coordination and communication learning between machine agents or human-machine…
External link:
http://arxiv.org/abs/2203.09498
Author:
Butcher, Andrew, Johanson, Michael Bradley, Davoodi, Elnaz, Brenneis, Dylan J. A., Acker, Leslie, Parker, Adam S. R., White, Adam, Modayil, Joseph, Pilarski, Patrick M.
In this paper, we contribute a multi-faceted study into Pavlovian signalling -- a process by which learned, temporally extended predictions made by one agent inform decision-making by another agent. Signalling is intimately connected to time and timing…
External link:
http://arxiv.org/abs/2201.03709
Author:
Brenneis, Dylan J. A., Parker, Adam S., Johanson, Michael Bradley, Butcher, Andrew, Davoodi, Elnaz, Acker, Leslie, Botvinick, Matthew M., Modayil, Joseph, White, Adam, Pilarski, Patrick M.
Artificial intelligence systems increasingly involve continual learning to enable flexibility in general situations that are not encountered during system training. Human interaction with autonomous systems is broadly studied, but research has hitherto…
External link:
http://arxiv.org/abs/2112.07774
Author:
Martin, John D., Modayil, Joseph
The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. Deep learning methods provide both optimization techniques and architectures for approximating nonlinear functions…
External link:
http://arxiv.org/abs/2106.09776
Many deep reinforcement learning algorithms contain inductive biases that sculpt the agent's objective and its interface to the environment. These inductive biases can take many forms, including domain knowledge and pretuned hyper-parameters. In general…
External link:
http://arxiv.org/abs/1907.02908
Rather than proposing a new method, this paper investigates an issue present in existing learning algorithms. We study the learning dynamics of reinforcement learning (RL), specifically a characteristic coupling between learning and data generation that…
External link:
http://arxiv.org/abs/1904.11455
Author:
van Hasselt, Hado, Doron, Yotam, Strub, Florian, Hessel, Matteo, Sonnerat, Nicolas, Modayil, Joseph
We know from reinforcement learning theory that temporal difference learning can fail in certain cases. Sutton and Barto (2018) identify a deadly triad of function approximation, bootstrapping, and off-policy learning. When these three properties are…
External link:
http://arxiv.org/abs/1812.02648
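The last entry names the deadly triad concretely, so a minimal sketch of two of its components may help orient readers: temporal difference learning with (linear, one-hot) function approximation, where each update bootstraps from the current value estimate of the next state. The three-state chain, rewards, and step sizes below are illustrative assumptions, not taken from the paper; note this on-policy setup converges, and divergence requires the third component, off-policy data.

```python
import numpy as np

# TD(0) with linear function approximation on a 3-state chain 0 -> 1 -> 2,
# where state 2 is terminal and entering it yields reward 1.
n_states, gamma, alpha = 3, 0.9, 0.1
features = np.eye(n_states)          # one-hot features (the tabular special case)
w = np.zeros(n_states)               # V(s) is approximated as w @ features[s]
rewards = np.array([0.0, 0.0, 1.0])  # reward received on entering each state

for _ in range(2000):                # repeat episodes until estimates settle
    s = 0
    while s < n_states - 1:
        s_next = s + 1
        r = rewards[s_next]
        terminal = (s_next == n_states - 1)
        # Bootstrapping: the target uses the current estimate V(s_next)
        target = r + (0.0 if terminal else gamma * (w @ features[s_next]))
        w += alpha * (target - w @ features[s]) * features[s]
        s = s_next

# Converges to the true values: V(1) = 1.0, V(0) = gamma * V(1) = 0.9
```

With only function approximation and bootstrapping present, the weights settle at the true on-policy values; the cited paper examines what changes once off-policy learning is added.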