Convex Methods for Constrained Linear Bandits

Author:	Afsharrad, Amirhossein; Moradipari, Ahmadreza; Lall, Sanjay
Year of publication:	2023
Subject:	Computer Science - Machine Learning; Electrical Engineering and Systems Science - Systems and Control
Document type:	Working Paper
Description:	Bandit optimization has recently received significant attention in real-world safety-critical systems that involve repeated interactions with humans. Although the literature offers various algorithms with performance guarantees, their practical implementation has received far less attention. This work presents a comprehensive study of the computational aspects of safe bandit algorithms, specifically safe linear bandits, by introducing a framework that leverages convex programming tools to create computationally efficient policies. In particular, we first characterize the properties of the optimal policy for the safe linear bandit problem and then propose an end-to-end pipeline of safe linear bandit algorithms that only involves solving convex problems. We also numerically evaluate the performance of our proposed methods.
Database:	arXiv
External link:	http://arxiv.org/abs/2311.04338