Search results

Report

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Authors: Wang, Zhiyong; Zhou, Dongruo; Lui, John C. S.; Sun, Wen

Learning a transition model via Maximum Likelihood Estimation (MLE) followed by planning inside the learned model is perhaps the most standard and simplest Model-based Reinforcement Learning (RL) framework. In this work, we show that such a simple Mo

External link: http://arxiv.org/abs/2408.08994

Report

Stochastic Bandits Robust to Adversarial Attacks

Authors: Wang, Xuchuang; Zuo, Jinhang; Liu, Xutong; Lui, John C. S.; Hajiesmaili, Mohammad

This paper investigates stochastic multi-armed bandit algorithms that are robust to adversarial attacks, where an attacker can first observe the learner's action and then alter their reward observation. We study two cases of this model, with or wit

External link: http://arxiv.org/abs/2408.08859

Report

AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics

Authors: Dai, Xiangxiang; Zhang, Zeyu; Yang, Peng; Xu, Yuedong; Liu, Xutong; Lui, John C. S.

The rapid evolution of multimedia and computer vision technologies requires adaptive visual model deployment strategies to effectively handle diverse tasks and varying environments. This work introduces AxiomVision, a novel framework that can guarant

External link: http://arxiv.org/abs/2407.20124

Report

Merit-based Fair Combinatorial Semi-Bandit with Unrestricted Feedback Delays

Authors: Chen, Ziqun; Cai, Kechao; Chen, Zhuoyue; Zhang, Jinbei; Lui, John C. S.

We study the stochastic combinatorial semi-bandit problem with unrestricted feedback delays under merit-based fairness constraints. This is motivated by applications such as crowdsourcing and online advertising, where immediate feedback is not immed

External link: http://arxiv.org/abs/2407.15439

Report

Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

Authors: Liu, Xutong; Wang, Siwei; Zuo, Jinhang; Zhong, Han; Wang, Xuchuang; Wang, Zhiyong; Li, Shuai; Hajiesmaili, Mohammad; Lui, John C. S.; Chen, Wei

We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random variable and the feedback follows a g

External link: http://arxiv.org/abs/2406.01386

Report

Cost-Effective Online Multi-LLM Selection with Versatile Reward Models

Authors: Dai, Xiangxiang; Li, Jin; Liu, Xutong; Yu, Anqi; Lui, John C. S.

With the rapid advancement of large language models (LLMs), the diversity of multi-LLM tasks and the variability in their pricing structures have become increasingly important, as costs can vary greatly between different LLMs. To tackle these challen

External link: http://arxiv.org/abs/2405.16587

Report

FedConPE: Efficient Federated Conversational Bandits with Heterogeneous Clients

Authors: Li, Zhuohua; Liu, Maoli; Lui, John C. S.

Conversational recommender systems have emerged as a potent solution for efficiently eliciting user preferences. These systems interactively present queries associated with "key terms" to users and leverage user feedback to estimate user preferences

External link: http://arxiv.org/abs/2405.02881

Report

Variance-Dependent Regret Bounds for Non-stationary Linear Bandits

Authors: Wang, Zhiyong; Xie, Jize; Chen, Yi; Lui, John C. S.; Zhou, Dongruo

We investigate the non-stationary stochastic linear bandit problem where the reward distribution evolves each round. Existing algorithms characterize the non-stationarity by the total variation budget $B_K$, which is the summation of the change of th

External link: http://arxiv.org/abs/2403.10732

Report

Federated Contextual Cascading Bandits with Asynchronous Communication and Heterogeneous Users

Authors: Yang, Hantao; Liu, Xutong; Wang, Zhiyong; Xie, Hong; Lui, John C. S.; Lian, Defu; Chen, Enhong

We study the problem of federated contextual combinatorial cascading bandits, where $|\mathcal{U}|$ agents collaborate under the coordination of a central server to provide tailored recommendations to the $|\mathcal{U}|$ corresponding users. Existing

External link: http://arxiv.org/abs/2402.16312

Report

Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes

Authors: Su, Xiaoxin; Zhou, Yipeng; Cui, Laizhong; Lui, John C. S.; Liu, Jiangchuan

In the Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds, without touching private data owned b

External link: http://arxiv.org/abs/2402.03770
