Showing 1 - 10 of 110 for search: '"Vojnović, Milan"'
In this study, we consider the infinitely many-armed bandit problems in a rested rotting setting, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios regarding the rotting of re
External link:
http://arxiv.org/abs/2404.14202
We investigate the convergence rates and data sample sizes required for training a machine learning model using a stochastic gradient descent (SGD) algorithm, where data points are sampled based on either their loss value or uncertainty value. These
External link:
http://arxiv.org/abs/2312.13927
We consider a combinatorial multi-armed bandit problem for maximum value reward function under maximum value and index feedback. This is a new feedback structure that lies in between commonly studied semi-bandit and full-bandit feedback structures. W
External link:
http://arxiv.org/abs/2305.16074
Author:
Yi, Jialin, Vojnović, Milan
Published in:
Proceedings of the 40th International Conference on Machine Learning 2023
We study a new non-stochastic federated multi-armed bandit problem with multiple agents collaborating via a communication network. The losses of the arms are assigned by an oblivious adversary that specifies the loss of each arm not only for each tim
External link:
http://arxiv.org/abs/2301.09223
Author:
Yi, Jialin, Vojnović, Milan
Published in:
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, pp. 1329-1335
We consider the nonstochastic multi-agent multi-armed bandit problem with agents collaborating via a communication network with delays. We show a lower bound for individual regret of all agents. We show that with suitable regularizers and communicati
External link:
http://arxiv.org/abs/2211.17154
Author:
Vojnovic, Milan, Zhou, Kaifang
We consider a discrete-time voter model process on a set of nodes, each being in one of two states, either 0 or 1. In each time step, each node adopts the state of a randomly sampled neighbor according to sampling probabilities, referred to as node i
External link:
http://arxiv.org/abs/2211.13628
We consider the infinitely many-armed bandit problem with rotting rewards, where the mean reward of an arm decreases at each pull of the arm according to an arbitrary trend with maximum rotting rate $\varrho=o(1)$. We show that this learning problem
External link:
http://arxiv.org/abs/2201.12975
Author:
Kim, Jung-hun, Vojnovic, Milan
We address a control system optimization problem that arises in multi-class, multi-server queueing system scheduling with uncertainty. In this scenario, jobs incur holding costs while awaiting completion, and job-server assignments yield observable s
External link:
http://arxiv.org/abs/2112.06362
Published in:
Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021
Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on t
External link:
http://arxiv.org/abs/2108.00230