Showing 1 - 6 of 6 for search: '"WOLSKI, Filip"'
Understanding how knowledge about the world is represented within model-free deep reinforcement learning methods is a major challenge, given the black-box nature of their learning process within high-dimensional observation and action spaces. AlphaStar…
External link:
http://arxiv.org/abs/1912.06721
Author:
OpenAI, Berner, Christopher, Brockman, Greg, Chan, Brooke, Cheung, Vicki, Dębiak, Przemysław, Dennison, Christy, Farhi, David, Fischer, Quirin, Hashme, Shariq, Hesse, Chris, Józefowicz, Rafal, Gray, Scott, Olsson, Catherine, Pachocki, Jakub, Petrov, Michael, Pinto, Henrique P. d. O., Raiman, Jonathan, Salimans, Tim, Schlatter, Jeremy, Schneider, Jonas, Sidor, Szymon, Sutskever, Ilya, Tang, Jie, Wolski, Filip, Zhang, Susan
On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges which will become increasingly central to more capable AI systems…
External link:
http://arxiv.org/abs/1912.06680
Author:
Houthooft, Rein, Chen, Richard Y., Isola, Phillip, Stadie, Bradly C., Wolski, Filip, Ho, Jonathan, Abbeel, Pieter
We propose a metalearning approach for learning gradient-based reinforcement learning (RL) algorithms. The idea is to evolve a differentiable loss function, such that an agent, which optimizes its policy to minimize this loss, will achieve high rewards…
External link:
http://arxiv.org/abs/1802.04821
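A minimal sketch of the evolved-loss idea this abstract describes: an outer evolution-strategies loop searches over the parameters phi of a loss function, while an inner loop trains a policy by gradient descent on that loss and reports the reward it reaches. The one-parameter "policy" and the form of the loss below are toy assumptions for illustration, not the paper's architecture:

```python
# Toy sketch: evolve the parameters phi of an inner-loop loss so that an
# agent minimizing that loss ends up with high reward. All specifics here
# are illustrative assumptions, not the EPG paper's actual loss network.
import numpy as np

rng = np.random.default_rng(0)

def inner_loop_return(phi):
    """Train a 1-D 'policy' theta by descending the phi-parametrized loss
    L(theta) = phi[0]*(theta - 1)^2 + phi[1]*theta, then report the reward
    -(theta - 1)^2, which is highest when theta reaches 1."""
    theta = 0.0
    for _ in range(50):
        grad = 2.0 * phi[0] * (theta - 1.0) + phi[1]   # dL/dtheta
        theta -= 0.1 * np.clip(grad, -10.0, 10.0)      # clipped for stability
    return -(theta - 1.0) ** 2

# Outer loop: evolution strategies ascends the expected return with respect
# to the loss parameters phi (no backprop through the inner loop).
phi = np.array([0.1, 1.0])
sigma, lr, pop = 0.1, 0.03, 32
for _ in range(200):
    eps = rng.normal(size=(pop, 2))
    returns = np.array([inner_loop_return(phi + sigma * e) for e in eps])
    z = (returns - returns.mean()) / (returns.std() + 1e-8)  # standardized
    phi += lr / (pop * sigma) * (z @ eps)
print("evolved phi:", phi, "return:", inner_loop_return(phi))
```

In this toy setup, ES should shrink phi[1] toward zero so that the minimizer of the evolved loss sits at theta = 1, the point of highest reward.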
Author:
WOLSKI, Filip
Published in:
Studia Prawno-Ekonomiczne / Studies in Law and Economics. (123):57-74
External link:
https://www.ceeol.com/search/article-detail?id=1065578
Author:
Schulman, John, Wolski, Filip, Dhariwal, Prafulla, Radford, Alec, Klimov, Oleg
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent. Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of minibatch updates…
External link:
http://arxiv.org/abs/1707.06347
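The "surrogate" objective this abstract refers to is PPO's clipped objective (this entry appears to be Proximal Policy Optimization Algorithms, arXiv:1707.06347). A minimal sketch of that objective, assuming a hypothetical batch of log-probabilities and advantage estimates rather than data from a real environment:

```python
# Minimal sketch of PPO's clipped surrogate objective L^CLIP.
# The batch below is hypothetical stand-in data; a real agent would
# collect it by interacting with an environment.
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """ratio = pi_new(a|s) / pi_old(a|s); clipping keeps each update
    from moving the policy too far from the one that sampled the data."""
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return np.minimum(unclipped, clipped).mean()  # ascend this

# Hypothetical batch: log-probs under the old and current policy,
# plus advantage estimates (e.g. from GAE).
rng = np.random.default_rng(0)
logp_old = rng.normal(-1.0, 0.3, size=64)
logp_new = logp_old + rng.normal(0.0, 0.05, size=64)
advantages = rng.normal(0.0, 1.0, size=64)
print(ppo_clip_objective(logp_new, logp_old, advantages))
```

Taking the minimum of the clipped and unclipped terms gives a pessimistic lower bound on the policy improvement, which is what makes several epochs of minibatch updates on the same sampled batch safe.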
Author:
Andrychowicz, Marcin, Wolski, Filip, Ray, Alex, Schneider, Jonas, Fong, Rachel, Welinder, Peter, McGrew, Bob, Tobin, Josh, Abbeel, Pieter, Zaremba, Wojciech
Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering…
External link:
http://arxiv.org/abs/1707.01495
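A minimal sketch of the hindsight relabeling behind this abstract, using the paper's "final" goal-selection strategy: every transition is stored a second time with the episode's last achieved state substituted as the goal, so even a failed episode yields transitions with reward 0. The Transition record and compute_reward below are illustrative assumptions, not the paper's code:

```python
# Sketch of Hindsight Experience Replay's relabeling step under the
# "final" strategy; all names here are illustrative, not from the paper.
from dataclasses import dataclass, replace
from typing import List

@dataclass(frozen=True)
class Transition:
    state: tuple
    action: int
    goal: tuple      # goal the agent was pursuing
    achieved: tuple  # goal actually achieved after this step
    reward: float
    done: bool

def compute_reward(achieved, goal):
    # Sparse, binary reward: 0 on success, -1 otherwise.
    return 0.0 if achieved == goal else -1.0

def her_relabel(episode: List[Transition]) -> List[Transition]:
    """Store each transition twice: once with its original goal, and once
    pretending the episode's final achieved state was the goal all along."""
    hindsight_goal = episode[-1].achieved
    extra = [replace(t, goal=hindsight_goal,
                     reward=compute_reward(t.achieved, hindsight_goal))
             for t in episode]
    return list(episode) + extra

# A failed two-step episode toward goal (2, 2) that ended at (1, 1):
# after relabeling, the copies treat (1, 1) as the goal, and the last
# one becomes a success with reward 0.
ep = [Transition((0, 0), 1, (2, 2), (0, 1), -1.0, False),
      Transition((0, 1), 0, (2, 2), (1, 1), -1.0, True)]
print(her_relabel(ep)[-1])
```

The relabeled transitions can then be fed to any off-policy RL algorithm, since the replayed experience is valid regardless of which goal it is conditioned on.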