Showing 1 - 10 of 496 results for search: '"Proutière, A."'
We consider the problem of learning an $\varepsilon$-optimal policy in controlled dynamical systems with low-rank latent structure. For this problem, we present LoRa-PI (Low-Rank Policy Iteration), a model-free learning algorithm alternating between
External link:
http://arxiv.org/abs/2410.23434
Authors:
Zheng, Frédéric; Proutiere, Alexandre
We study the split Conformal Prediction method when applied to Markovian data. We quantify the gap in terms of coverage induced by the correlations in the data (compared to exchangeable data). This gap strongly depends on the mixing properties of the
External link:
http://arxiv.org/abs/2407.15277
Authors:
Russo, Alessio; Proutiere, Alexandre
Published in:
Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution. We adopt an information-theoretical viewpoint and start from the instance-specific lower bound of the number of samples that have to be collected t
External link:
http://arxiv.org/abs/2407.00801
We study contextual bandits with low-rank structure where, in each round, if the (context, arm) pair $(i,j)\in [m]\times [n]$ is selected, the learner observes a noisy sample of the $(i,j)$-th entry of an unknown low-rank reward matrix. Successive co
External link:
http://arxiv.org/abs/2402.15739
We consider the problem of identifying the best arm in stochastic Multi-Armed Bandits (MABs) using a fixed sampling budget. Characterizing the minimal instance-specific error probability for this problem constitutes one of the important remaining ope
External link:
http://arxiv.org/abs/2312.12137
We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure. In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may for exam
External link:
http://arxiv.org/abs/2310.06793
Authors:
Tranos, Damianos; Proutiere, Alexandre
We consider the problem of adaptive Model Predictive Control (MPC) for uncertain linear systems with additive disturbances and with state and input constraints. We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online algorithm
External link:
http://arxiv.org/abs/2310.04842
We study the problem of best-arm identification with fixed budget in stochastic multi-armed bandits with Bernoulli rewards. For the problem with two arms, also known as the A/B testing problem, we prove that there is no algorithm that (i) performs as
External link:
http://arxiv.org/abs/2308.12000
We consider the problem of recovering hidden communities in the Labeled Stochastic Block Model (LSBM) with a finite number of clusters, where cluster sizes grow linearly with the total number $n$ of items. In the LSBM, a label is (independently) obse
External link:
http://arxiv.org/abs/2306.12968
Published in:
2023 62nd IEEE Conference on Decision and Control (CDC). IEEE, 2023
Reinforcement Learning aims at identifying and evaluating efficient control policies from data. In many real-world applications, the learner is not allowed to experiment and cannot gather data in an online manner (this is the case when experimenting
External link:
http://arxiv.org/abs/2304.02574