Showing 1 - 10 of 482 for search: '"Lui, John C. S."'
Published in:
Proceedings of the 41st International Conference on Machine Learning. PMLR 235, 2024
We explore whether quantum advantages can be found for the zeroth-order feedback online exp-concave optimization problem, which is also known as bandit exp-concave optimization with multi-point feedback. We present quantum online quasi-Newton methods
External link:
http://arxiv.org/abs/2410.19688
We introduce a novel framework called combinatorial logistic bandits (CLogB), where in each round, a subset of base arms (called the super arm) is selected, with the outcome of each base arm being binary and its expectation following a logistic param
External link:
http://arxiv.org/abs/2410.17075
This paper considers the problem of finding the $(\delta,\epsilon)$-Goldstein stationary point of a Lipschitz continuous objective, which is a rich function class that covers a great number of important applications. We construct a zeroth-order quantum e
External link:
http://arxiv.org/abs/2410.16189
Learning a transition model via Maximum Likelihood Estimation (MLE) followed by planning inside the learned model is perhaps the most standard and simplest Model-based Reinforcement Learning (RL) framework. In this work, we show that such a simple Mo
External link:
http://arxiv.org/abs/2408.08994
This paper investigates stochastic multi-armed bandit algorithms that are robust to adversarial attacks, where an attacker can first observe the learner's action and then alter their reward observation. We study two cases of this model, with or wit
External link:
http://arxiv.org/abs/2408.08859
The rapid evolution of multimedia and computer vision technologies requires adaptive visual model deployment strategies to effectively handle diverse tasks and varying environments. This work introduces AxiomVision, a novel framework that can guarant
External link:
http://arxiv.org/abs/2407.20124
We study the stochastic combinatorial semi-bandit problem with unrestricted feedback delays under merit-based fairness constraints. This is motivated by applications such as crowdsourcing and online advertising, where immediate feedback is not immed
External link:
http://arxiv.org/abs/2407.15439
Authors:
Liu, Xutong, Wang, Siwei, Zuo, Jinhang, Zhong, Han, Wang, Xuchuang, Wang, Zhiyong, Li, Shuai, Hajiesmaili, Mohammad, Lui, John C. S., Chen, Wei
We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random variable and the feedback follows a g
External link:
http://arxiv.org/abs/2406.01386
With the rapid advancement of large language models (LLMs), the diversity of multi-LLM tasks and the variability in their pricing structures have become increasingly important, as costs can vary greatly between different LLMs. To tackle these challen
External link:
http://arxiv.org/abs/2405.16587
Conversational recommender systems have emerged as a potent solution for efficiently eliciting user preferences. These systems interactively present queries associated with "key terms" to users and leverage user feedback to estimate user preferences
External link:
http://arxiv.org/abs/2405.02881