Showing 1 - 1 of 1 for search: '"Llorens, Hector Urdiales"'
Contextual multi-armed bandit problems arise frequently in important industrial applications. Existing solutions model the context either linearly, which enables uncertainty-driven (principled) exploration, or non-linearly, by using epsilon-greedy ex
External link:
http://arxiv.org/abs/1807.09809