Showing 1 - 10 of 92
for query: '"Hunt, Jonathan J."'
We propose a new class of deep reinforcement learning (RL) algorithms that model latent representations in hyperbolic space. Sequential decision-making requires reasoning about the possible future consequences of current behavior. Consequently, captu…
External link:
http://arxiv.org/abs/2210.01542
Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously collected static dataset, is an important paradigm of RL. Standard RL methods often perform poorly in this regime due to the function approximation errors o…
External link:
http://arxiv.org/abs/2208.06193
Most recommender systems are myopic, that is, they optimize based on the immediate response of the user. This may be misaligned with the true objective, such as creating long-term user satisfaction. In this work we focus on mobile push notifications,…
External link:
http://arxiv.org/abs/2202.08812
Author:
O'Brien, Conor, Thiagarajan, Arvind, Das, Sourav, Barreto, Rafael, Verma, Chetan, Hsu, Tim, Neufield, James, Hunt, Jonathan J
Online advertising has typically been more personalized than offline advertising, through the use of machine learning models and real-time auctions for ad targeting. One specific task, predicting the likelihood of conversion (i.e. the probability a…
External link:
http://arxiv.org/abs/2201.12666
Author:
Yue, Yuguang, Xie, Yuanpu, Wu, Huasen, Jia, Haofeng, Zhai, Shaodan, Shi, Wenzhe, Hunt, Jonathan J
Listwise ranking losses have been widely studied in recommender systems. However, new paradigms of content consumption present new challenges for ranking methods. In this work we contribute an analysis of learning to rank for personalized mobile push…
External link:
http://arxiv.org/abs/2201.07681
Industrial recommender systems are frequently tasked with approximating probabilities for multiple, often closely related, user actions. For example, predicting if a user will click on an advertisement and if they will then purchase the advertised pr…
External link:
http://arxiv.org/abs/2108.13475
Author:
Mirza, Mehdi, Jaegle, Andrew, Hunt, Jonathan J., Guez, Arthur, Tunyasuvunakool, Saran, Muldal, Alistair, Weber, Théophane, Karkus, Peter, Racanière, Sébastien, Buesing, Lars, Lillicrap, Timothy, Heess, Nicolas
Recent work in deep reinforcement learning (RL) has produced algorithms capable of mastering challenging games such as Go, chess, or shogi. In these works the RL agent directly observes the natural state of the game and controls that state directly w…
External link:
http://arxiv.org/abs/2009.05524
Composing previously mastered skills to solve novel tasks promises dramatic improvements in the data efficiency of reinforcement learning. Here, we analyze two recent works composing behaviors represented in the form of action-value functions and sho…
External link:
http://arxiv.org/abs/1812.02216
Author:
Rae, Jack W, Hunt, Jonathan J, Harley, Tim, Danihelka, Ivo, Senior, Andrew, Wayne, Greg, Graves, Alex, Lillicrap, Timothy P
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both spac…
External link:
http://arxiv.org/abs/1610.09027
Author:
Barreto, André, Dabney, Will, Munos, Rémi, Hunt, Jonathan J., Schaul, Tom, van Hasselt, Hado, Silver, David
Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environmen…
External link:
http://arxiv.org/abs/1606.05312