Showing 1 - 10 of 62 for search: '"Mnih, Volodymyr"'
Efficient video tokenization remains a key bottleneck in learning general purpose vision models that are capable of processing long video sequences. Prevailing approaches are restricted to encoding videos to a fixed number of tokens, where too few to …
External link:
http://arxiv.org/abs/2410.08368
Author:
Baumli, Kate, Baveja, Satinder, Behbahani, Feryal, Chan, Harris, Comanici, Gheorghe, Flennerhag, Sebastian, Gazeau, Maxime, Holsheimer, Kristian, Horgan, Dan, Laskin, Michael, Lyle, Clare, Masoom, Hussain, McKinney, Kay, Mnih, Volodymyr, Neitz, Alexander, Nikulin, Dmitry, Pardo, Fabio, Parker-Holder, Jack, Quan, John, Rocktäschel, Tim, Sahni, Himanshu, Schaul, Tom, Schroecker, Yannick, Spencer, Stephen, Steigerwald, Richie, Wang, Luyu, Zhang, Lei
Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning. A key limiting factor for building generalist agents with RL has been the need for a large number o …
External link:
http://arxiv.org/abs/2312.09187
Author:
Laskin, Michael, Wang, Luyu, Oh, Junhyuk, Parisotto, Emilio, Spencer, Stephen, Steigerwald, Richie, Strouse, DJ, Hansen, Steven, Filos, Angelos, Brooks, Ethan, Gazeau, Maxime, Sahni, Himanshu, Singh, Satinder, Mnih, Volodymyr
We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model. Algorithm Distillation treats learning to reinforcement lea …
External link:
http://arxiv.org/abs/2210.14215
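The Algorithm Distillation entry above describes training a causal sequence model on the *training histories* of a source RL algorithm rather than on single episodes. A minimal sketch of that data layout (function names and constants here are illustrative, not taken from the paper):

```python
import random

# Hedged sketch of the data layout behind Algorithm Distillation (AD):
# the sequence model is trained on long "learning histories" spanning
# many consecutive episodes of a source RL algorithm, so next-action
# prediction forces it to model the improvement operator itself.
# make_learning_history and next_action_targets are illustrative names.

random.seed(0)

def make_learning_history(num_episodes=5, episode_len=4, num_actions=3):
    """Concatenate (obs, action, reward) triples across consecutive
    episodes, in the order the source algorithm generated them."""
    history = []
    for _episode in range(num_episodes):
        for _step in range(episode_len):
            obs = (random.random(), random.random())
            action = random.randrange(num_actions)
            reward = 1.0 if random.random() > 0.5 else 0.0
            history.append((obs, action, reward))
    return history

def next_action_targets(history):
    """Causal targets: at position t the model predicts action a_t
    from everything in the history before it."""
    return [action for (_obs, action, _reward) in history]

history = make_learning_history()
targets = next_action_targets(history)
assert len(history) == 20 and len(targets) == 20
```

Because the context window spans episode boundaries, the model sees rewards improving over the history, which is what lets it imitate the learning algorithm rather than a single policy.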
Large and diverse datasets have been the cornerstones of many impressive advancements in artificial intelligence. Intelligent creatures, however, learn by interacting with the environment, which changes the input sensory signals and the state of the …
External link:
http://arxiv.org/abs/2210.10913
This paper deals with the problem of learning a skill-conditioned policy that acts meaningfully in the absence of a reward signal. Mutual information based objectives have shown some success in learning skills that reach a diverse set of states in th …
External link:
http://arxiv.org/abs/2110.15331
Author:
Zahavy, Tom, O'Donoghue, Brendan, Barreto, Andre, Mnih, Volodymyr, Flennerhag, Sebastian, Singh, Satinder
Finding different solutions to the same problem is a key aspect of intelligence associated with creativity and adaptation to novel situations. In reinforcement learning, a set of diverse policies can be useful for exploration, transfer, hierarchy, an …
External link:
http://arxiv.org/abs/2106.00669
In the absence of external rewards, agents can still learn useful behaviors by identifying and mastering a set of diverse skills within their environment. Existing skill learning methods use mutual information objectives to incentivize each skill to …
External link:
http://arxiv.org/abs/2012.07827
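Several of the skill-learning entries above rely on a mutual-information objective: the intrinsic reward for acting with skill z in state s is log q(z|s) − log p(z), where q is a learned skill discriminator. A toy sketch of that reward (the nearest-center softmax classifier below is an illustrative stand-in for a learned discriminator network):

```python
import math

# Illustrative sketch of a mutual-information skill reward in the
# spirit of these papers: r(s, z) = log q(z|s) - log p(z).
# The distance-based logits are a toy stand-in for a learned model.

def discriminator_logits(state, skill_centers):
    # q(z|s): a skill is "responsible" for states near its center
    return [-sum((s - c) ** 2 for s, c in zip(state, center))
            for center in skill_centers]

def intrinsic_reward(state, z, skill_centers):
    logits = discriminator_logits(state, skill_centers)
    log_norm = math.log(sum(math.exp(l) for l in logits))
    log_q = logits[z] - log_norm
    log_p = -math.log(len(skill_centers))  # uniform prior over skills
    return log_q - log_p

centers = [(0.0, 0.0), (3.0, 3.0)]
# a state near skill 0's center earns a positive reward under z = 0
reward = intrinsic_reward((0.1, -0.1), 0, centers)
assert reward > 0
```

Maximizing this reward pushes each skill toward states where the discriminator can tell it apart from the other skills, which is the diversity pressure these abstracts refer to.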
Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions. Motivated by techniques from amortized inference, we replace the expensive maximization over all a …
External link:
http://arxiv.org/abs/2001.08116
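The amortized Q-learning entry above replaces the exact maximization max_a Q(s, a) with a maximum over a small set of candidate actions drawn from a learned proposal distribution. A minimal sketch of that idea (the quadratic Q-function and fixed Gaussian proposal below are toy stand-ins for the paper's learned networks):

```python
import random

# Hedged sketch of amortized maximization: instead of enumerating every
# action to evaluate max_a Q(s, a), sample candidates from a proposal
# distribution (learned in practice) and keep the best one.

random.seed(0)

def q_value(state, action):
    # toy continuous-action Q, peaked where action == state
    return -sum((a - s) ** 2 for a, s in zip(action, state))

def amortized_max(state, proposal_mean, num_candidates=64):
    best_action, best_q = None, float("-inf")
    for _ in range(num_candidates):
        # proposal: Gaussian around proposal_mean (a stand-in here)
        candidate = [m + random.gauss(0.0, 0.5) for m in proposal_mean]
        q = q_value(state, candidate)
        if q > best_q:
            best_action, best_q = candidate, q
    return best_action, best_q

state = [1.0, -1.0]
best_action, best_q = amortized_max(state, proposal_mean=state)
assert best_q <= 0.0
```

The cost of the argmax now scales with the number of sampled candidates rather than the size of the action space, which is what makes the approach viable for continuous actions.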
Author:
Kulkarni, Tejas, Gupta, Ankush, Ionescu, Catalin, Borgeaud, Sebastian, Reynolds, Malcolm, Zisserman, Andrew, Mnih, Volodymyr
The study of object representations in computer vision has primarily focused on developing representations that are useful for image classification, object detection, or semantic segmentation as downstream tasks. In this work we aim to learn object r …
External link:
http://arxiv.org/abs/1906.11883
Author:
Hansen, Steven, Dabney, Will, Barreto, Andre, Van de Wiele, Tom, Warde-Farley, David, Mnih, Volodymyr
It has been established that diverse behaviors spanning the controllable subspace of a Markov decision process can be trained by rewarding a policy for being distinguishable from other policies \citep{gregor2016variational, eysenbach2018diversity, w …
External link:
http://arxiv.org/abs/1906.05030