Showing 1 - 1 of 1 for search: '"Yoo, Seung Won Wilson"'
We consider the problem of online learning where the sequence of actions played by the learner must adhere to an unknown safety constraint at every round. The goal is to minimize regret with respect to the best safe action in hindsight while simultan
External link:
http://arxiv.org/abs/2403.04033