Showing 1 - 2 of 2 for search: '"Baudry, D. (Dorian)"'
Author:
Baudry, D. (Dorian)
A bandit is a learning problem in which an agent sequentially chooses an action to try from a fixed set of candidates, collects a reward, and devises a strategy to maximize its cumulative gain.
External link:
https://explore.openaire.eu/search/publication?articleId=od______4198::d7e22c977dbaaaa57f8345f7ead59718
http://hdl.handle.net/20.500.12210/79821
The stochastic multi-armed bandit problem has been extensively studied under standard assumptions on the arms' distributions (e.g., bounded with known support, exponential family, etc.). These assumptions are suitable for many real-world problems but somet
External link:
https://explore.openaire.eu/search/publication?articleId=od______4198::b5171ea11ffe4ca0d076a9bff57668c7
http://hdl.handle.net/20.500.12210/57852