Authors:
R, Rahul N; Katewa, Vaibhav
We consider a sequential stochastic multi-armed bandit problem in which the agent interacts with the bandit over multiple episodes. The reward distributions of the arms remain constant throughout an episode but can change from one episode to the next. We propose
External link:
http://arxiv.org/abs/2403.12428