Showing 1 - 6 of 6
for search: '"Bortkiewicz, Michał"'
Author:
Bortkiewicz, Michał, Pałucki, Władek, Myers, Vivek, Dziarmaga, Tadeusz, Arczewski, Tomasz, Kuciński, Łukasz, Eysenbach, Benjamin
Self-supervision has the potential to transform reinforcement learning (RL), paralleling the breakthroughs it has enabled in other areas of machine learning. While self-supervised learning in other domains aims to find patterns in a fixed dataset, se…
External link:
http://arxiv.org/abs/2408.11052
Author:
Nauman, Michal, Bortkiewicz, Michał, Miłoś, Piotr, Trzciński, Tomasz, Ostaszewski, Mateusz, Cygan, Marek
Recent advancements in off-policy Reinforcement Learning (RL) have significantly improved sample efficiency, primarily due to the incorporation of various forms of regularization that enable more gradient update steps than traditional agents. However…
External link:
http://arxiv.org/abs/2403.00514
Author:
Wołczyk, Maciej, Cupiał, Bartłomiej, Ostaszewski, Mateusz, Bortkiewicz, Michał, Zając, Michał, Pascanu, Razvan, Kuciński, Łukasz, Miłoś, Piotr
Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models. However, fine-tuning reinforcement learning (RL) models remains a challen…
External link:
http://arxiv.org/abs/2402.02868
Author:
Kessler, Samuel, Ostaszewski, Mateusz, Bortkiewicz, Michał, Żarski, Mateusz, Wołczyk, Maciej, Parker-Holder, Jack, Roberts, Stephen J., Miłoś, Piotr
World models power some of the most efficient reinforcement learning algorithms. In this work, we showcase that they can be harnessed for continual learning - a situation when the agent faces changing environments. World models typically employ a rep…
External link:
http://arxiv.org/abs/2211.15944
Author:
Bortkiewicz, Michał, Łyskawa, Jakub, Wawrzyński, Paweł, Ostaszewski, Mateusz, Grudkowski, Artur, Trzciński, Tomasz
Hierarchical decomposition of control is unavoidable in large dynamical systems. In reinforcement learning (RL), it is usually solved with subgoals defined at higher policy levels and achieved at lower policy levels. Reaching these goals can take a s…
External link:
http://arxiv.org/abs/2211.06351
We introduce a new method for internal replay that modulates the frequency of rehearsal based on the depth of the network. While replay strategies mitigate the effects of catastrophic forgetting in neural networks, recent works on generative replay s…
External link:
http://arxiv.org/abs/2207.01562