Showing 1 - 4 of 4 for search: '"Ian Osband"'
Reinforcement learning agents have demonstrated remarkable achievements in simulated environments. Data efficiency, however, significantly impedes carrying this success over to real environments. The design of data-efficient agents that address this
Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may
Authors:
Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John Agapiou, Joel Leibo, Audrunas Gruslys
Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d2603999dd3fc6b0bb5f7dadefd67bf1
http://arxiv.org/abs/1704.03732