Showing 1 - 1 of 1 for search: '"Ceyani, Efe Eren"'
We consider the Pareto set identification (PSI) problem in multi-objective multi-armed bandits (MO-MAB) with contaminated reward observations. At each arm pull, with some fixed probability, the true reward samples are replaced with the samples from a
External link:
http://arxiv.org/abs/2206.02666