Search results

Report

Optimal Multi-Fidelity Best-Arm Identification

Authors: Poiani, Riccardo; Degenne, Rémy; Kaufmann, Emilie; Metelli, Alberto Maria; Restelli, Marcello

In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to sample an arm

External link: http://arxiv.org/abs/2406.03033

Report

Finding good policies in average-reward Markov Decision Processes without prior knowledge

Authors: Tuynman, Adrienne; Degenne, Rémy; Kaufmann, Emilie

We revisit the identification of an $\varepsilon$-optimal policy in average-reward Markov Decision Processes (MDP). In such MDPs, two measures of complexity have appeared in the literature: the diameter, $D$, and the optimal bias span, $H$, which sat

External link: http://arxiv.org/abs/2405.17108

Report

Information Lower Bounds for Robust Mean Estimation

Authors: Degenne, Rémy; Mathieu, Timothée

We prove lower bounds on the error of any estimator for the mean of a real probability distribution under the knowledge that the distribution belongs to a given set. We apply these lower bounds both to parametric and nonparametric estimation. In the

External link: http://arxiv.org/abs/2403.01892

Report

An $\varepsilon$-Best-Arm Identification Algorithm for Fixed-Confidence and Beyond

Authors: Jourdan, Marc; Degenne, Rémy; Kaufmann, Emilie

We propose EB-TC$\varepsilon$, a novel sampling rule for $\varepsilon$-best arm identification in stochastic bandits. It is the first instance of a Top Two algorithm analyzed for approximate best arm identification. EB-TC$\varepsilon$ is an *anytime* s

External link: http://arxiv.org/abs/2305.16041

Report

On the Existence of a Complexity in Fixed Budget Bandit Identification

Author: Degenne, Rémy

In fixed budget bandit identification, an algorithm sequentially observes samples from several distributions up to a given final time. It then answers a query about the set of distributions. A good algorithm will have a small probability of error. Wh

External link: http://arxiv.org/abs/2303.09468

Report

A Formalization of Doob's Martingale Convergence Theorems in mathlib

Authors: Ying, Kexing; Degenne, Rémy

We present the formalization of Doob's martingale convergence theorems in the mathlib library for the Lean theorem prover. These theorems give conditions under which (sub)martingales converge, almost everywhere or in $L^1$. In order to formalize thos

External link: http://arxiv.org/abs/2212.05578

Report

Non-Asymptotic Analysis of a UCB-based Top Two Algorithm

Authors: Jourdan, Marc; Degenne, Rémy

A Top Two sampling rule for bandit identification is a method which selects the next arm to sample from among two candidate arms, a leader and a challenger. Due to their simplicity and good empirical performance, they have received increased attentio

External link: http://arxiv.org/abs/2210.05431

Report

Dealing with Unknown Variances in Best-Arm Identification

Authors: Jourdan, Marc; Degenne, Rémy; Kaufmann, Emilie

The problem of identifying the best arm among a collection of items having Gaussian rewards distribution is well understood when the variances are known. Despite its practical relevance for many applications, few works studied it for unknown variance

External link: http://arxiv.org/abs/2210.00974

Report

Top Two Algorithms Revisited

Authors: Jourdan, Marc; Degenne, Rémy; Baudry, Dorian; de Heide, Rianne; Kaufmann, Emilie

Top Two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models (Russo, 2016), for parametric families of arms. They select the next arm to sample from by randomizing among two candidate arms, a

External link: http://arxiv.org/abs/2206.05979

Report

Choosing Answers in $\varepsilon$-Best-Answer Identification for Linear Bandits

Authors: Jourdan, Marc; Degenne, Rémy

In pure-exploration problems, information is gathered sequentially to answer a question on the stochastic environment. While best-arm identification for linear bandits has been extensively studied in recent years, few works have been dedicated to ide

External link: http://arxiv.org/abs/2206.04456
