Showing 1 - 10 of 73 results for search: '"Hausknecht, Matthew"'
Author:
Carroll, Micah, Paradise, Orr, Lin, Jessy, Georgescu, Raluca, Sun, Mingfei, Bignell, David, Milani, Stephanie, Hofmann, Katja, Hausknecht, Matthew, Dragan, Anca, Devlin, Sam
Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision-making, where many
External link:
http://arxiv.org/abs/2211.10869
Author:
Wagener, Nolan, Kolobov, Andrey, Frujeri, Felipe Vieira, Loynd, Ricky, Cheng, Ching-An, Hausknecht, Matthew
Simulated humanoids are an appealing research domain due to their physical capabilities. Nonetheless, they are also challenging to control, as a policy must drive an unstable, discontinuous, and high-dimensional physical system. One widely studied ap
External link:
http://arxiv.org/abs/2208.07363
Author:
Carroll, Micah, Lin, Jessy, Paradise, Orr, Georgescu, Raluca, Sun, Mingfei, Bignell, David, Milani, Stephanie, Hofmann, Katja, Hausknecht, Matthew, Dragan, Anca, Devlin, Sam
Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many
External link:
http://arxiv.org/abs/2204.13326
Author:
Weir, Nathaniel, Yuan, Xingdi, Côté, Marc-Alexandre, Hausknecht, Matthew, Laroche, Romain, Momennejad, Ida, Van Seijen, Harm, Van Durme, Benjamin
Humans have the capability, aided by the expressive compositionality of their language, to learn quickly by demonstration. They are able to describe unseen task-performing procedures and generalize their execution to other contexts. In this work, we
External link:
http://arxiv.org/abs/2203.04806
Author:
Hausknecht, Matthew, Wagener, Nolan
Dropout has long been a staple of supervised learning, but is rarely used in reinforcement learning. We analyze why naive application of dropout is problematic for policy-gradient learning algorithms and introduce consistent dropout, a simple techniq
External link:
http://arxiv.org/abs/2202.11818
Author:
Mohanty, Sharada, Poonganam, Jyotish, Gaidon, Adrien, Kolobov, Andrey, Wulfe, Blake, Chakraborty, Dipam, Šemetulskis, Gražvydas, Schapke, João, Kubilius, Jonas, Pašukonis, Jurgis, Klimas, Linas, Hausknecht, Matthew, MacAlpine, Patrick, Tran, Quang Nhat, Tumiel, Thomas, Tang, Xiaocheng, Chen, Xinwei, Hesse, Christopher, Hilton, Jacob, Guss, William Hebgen, Genc, Sahika, Schulman, John, Cobbe, Karl
The NeurIPS 2020 Procgen Competition was designed as a centralized benchmark with clearly defined tasks for measuring Sample Efficiency and Generalization in Reinforcement Learning. Generalization remains one of the most fundamental challenges in dee
External link:
http://arxiv.org/abs/2103.15332
Text-based games simulate worlds and interact with players using natural language. Recent work has used them as a testbed for autonomous language-understanding agents, with the motivation being that understanding the meanings of words or semantics is
External link:
http://arxiv.org/abs/2103.13552
Author:
Shridhar, Mohit, Yuan, Xingdi, Côté, Marc-Alexandre, Bisk, Yonatan, Trischler, Adam, Hausknecht, Matthew
Given a simple request like "Put a washed apple in the kitchen fridge," humans can reason in purely abstract terms by imagining action sequences and scoring their likelihood of success, prototypicality, and efficiency, all without moving a muscle. Once
External link:
http://arxiv.org/abs/2010.03768
Text-based games present a unique challenge for autonomous agents to operate in natural language and handle enormous action spaces. In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates a
External link:
http://arxiv.org/abs/2010.02903
Text-based games are long puzzles or quests, characterized by a sequence of sparse and potentially deceptive rewards. They provide an ideal platform to develop agents that perceive and act upon the world using a combinatorially sized natural language
External link:
http://arxiv.org/abs/2006.07409