Asymptotically Optimal Strategies for Combinatorial Semi-Bandits in Polynomial Time
Autor: | Thibaut Cuvelier, Richard Combes, Eric Gourdin |
---|---|
Přispěvatelé: | CentraleSupélec, Orange Gardens |
Jazyk: | angličtina |
Rok vydání: | 2021 |
Předmět: |
FOS: Computer and information sciences
Computer Science::Machine Learning Computer Science - Machine Learning Machine Learning (stat.ML) [INFO.INFO-DM]Computer Science [cs]/Discrete Mathematics [cs.DM] Approximation algorithms Machine Learning (cs.LG) Combinatorial optimisation [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] Statistics - Machine Learning TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY Reinforcement learning [MATH.MATH-CO]Mathematics [math]/Combinatorics [math.CO] Bandit algorithms [MATH.MATH-OC]Mathematics [math]/Optimization and Control [math.OC] |
Zdroj: | Proceedings of Machine Learning Research Algorithmic Learning Theory Algorithmic Learning Theory, Mar 2021, Paris, France HAL |
Popis: | We consider combinatorial semi-bandits with uncorrelated Gaussian rewards. In this article, we propose the first method, to the best of our knowledge, that enables to compute the solution of the Graves-Lai optimization problem in polynomial time for many combinatorial structures of interest. In turn, this immediately yields the first known approach to implement asymptotically optimal algorithms in polynomial time for combinatorial semi-bandits. 26 pages |
Databáze: | OpenAIRE |
Externí odkaz: |