Showing 1 - 10 of 23 for search: '"Klassen, Toryn Q."'
Authors:
Li, Andrew C., Chen, Zizhao, Klassen, Toryn Q., Vaezipoor, Pashootan, Icarte, Rodrigo Toro, McIlraith, Sheila A.
Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they enable counterfactual learning updates
External link:
http://arxiv.org/abs/2406.00120
Fair decision making has largely been studied with respect to a single decision. Here we investigate the notion of fairness in the context of sequential decision making where multiple stakeholders can be affected by the outcomes of decisions. We obse
External link:
http://arxiv.org/abs/2312.04772
Authors:
Li, Andrew C., Chen, Zizhao, Vaezipoor, Pashootan, Klassen, Toryn Q., Icarte, Rodrigo Toro, McIlraith, Sheila A.
Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions. We investigate how to generate policies via RL when reward functions are specified in a symbolic language captured by Reward Machines
External link:
http://arxiv.org/abs/2211.10902
Authors:
Tuli, Mathieu, Li, Andrew C., Vaezipoor, Pashootan, Klassen, Toryn Q., Sanner, Scott, McIlraith, Sheila A.
Text-based games present a unique class of sequential decision making problem in which agents interact with a partially observable, simulated environment via actions and observations conveyed through natural language. Such observations typically incl
External link:
http://arxiv.org/abs/2211.04591
Authors:
Icarte, Rodrigo Toro, Waldie, Ethan, Klassen, Toryn Q., Valenzano, Richard, Castro, Margarita P., McIlraith, Sheila A.
Reinforcement learning (RL) is a central problem in artificial intelligence. This problem consists of defining artificial agents that can learn optimal behaviour by interacting with an environment -- where the optimal behaviour is defined with respec
External link:
http://arxiv.org/abs/2112.09477
Recent work in AI safety has highlighted that in sequential decision making, objectives are often underspecified or incomplete. This gives discretion to the acting agent to realize the stated objective in ways that may result in undesirable outcomes.
External link:
http://arxiv.org/abs/2106.02617
Published in:
Journal of Artificial Intelligence Research 73 (2022) 173-208
Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have t
External link:
http://arxiv.org/abs/2010.03950
Authors:
Icarte, Rodrigo Toro, Valenzano, Richard, Klassen, Toryn Q., Christoffersen, Phillip, Farahmand, Amir-massoud, McIlraith, Sheila A.
Reinforcement Learning (RL) agents typically learn memoryless policies, that is, policies that only consider the last observation when selecting actions. Learning memoryless policies is efficient and optimal in fully observable environments. However, some fo
External link:
http://arxiv.org/abs/2010.01753
Theory of Mind is commonly defined as the ability to attribute mental states (e.g., beliefs, goals) to oneself, and to others. A large body of previous work - from the social sciences to artificial intelligence - has observed that Theory of Mind capa
External link:
http://arxiv.org/abs/2005.02963
Authors:
Toro Icarte, Rodrigo, Klassen, Toryn Q., Valenzano, Richard, Castro, Margarita P., Waldie, Ethan, McIlraith, Sheila A.
Published in:
Artificial Intelligence 323 (October 2023)