Showing 1 - 10 of 18,550 for the search: '"Fast rates"'
We introduce and investigate the asymptotic behaviour of the trajectories of a second-order dynamical system with Tikhonov regularization for solving a monotone equation with a single-valued, monotone and continuous operator acting on a real Hilbert space…
External link:
http://arxiv.org/abs/2411.17329
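For orientation, a minimal sketch of what such a Tikhonov-regularized second-order system typically looks like; the snippet does not give the authors' exact damping term or regularization schedule, so the vanishing damping $\alpha/t$ and the parameter $\varepsilon(t)$ below are illustrative assumptions.

$$\ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + M\bigl(x(t)\bigr) + \varepsilon(t)\,x(t) = 0, \qquad \varepsilon(t) \to 0 \text{ as } t \to \infty,$$

where $M$ is the single-valued, monotone, continuous operator whose zero is sought; the Tikhonov term $\varepsilon(t)\,x(t)$ is what typically drives the trajectory toward the minimum-norm solution of $M(x) = 0$.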
We consider a setting involving $N$ agents, where each agent interacts with an environment modeled as a Markov Decision Process (MDP). The agents' MDPs differ in their reward functions, capturing heterogeneous objectives/tasks. The collective goal of…
External link:
http://arxiv.org/abs/2409.05291
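A minimal Python sketch of the setting as described: agents share environment dynamics but carry agent-specific reward functions. The shared-dynamics reading and the averaged objective are assumptions for illustration, since the snippet cuts off before stating the collective goal.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State, Action = int, int  # kept abstract; only the structure of the setting is sketched

@dataclass
class AgentMDP:
    """One agent's view of the environment: dynamics assumed shared, reward agent-specific."""
    reward: Callable[[State, Action], float]  # heterogeneous objective/task

def average_return(agents: List[AgentMDP], trajectory: List[Tuple[State, Action]]) -> float:
    """One possible collective objective (an assumption, not stated in the snippet):
    the average of the agents' returns along a trajectory from a shared policy."""
    return sum(
        sum(agent.reward(s, a) for (s, a) in trajectory) for agent in agents
    ) / len(agents)

# Toy usage: two agents that value the same trajectory differently.
agents = [AgentMDP(reward=lambda s, a: float(s == a)),
          AgentMDP(reward=lambda s, a: float(a))]
print(average_return(agents, trajectory=[(0, 0), (1, 2)]))  # 1.5
```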
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct. Our main contribution is in designing a novel learning algorithm…
External link:
http://arxiv.org/abs/2406.12406
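The bandit-feedback protocol described in the snippet can be made concrete with a short sketch; the predictor below is a placeholder, not the algorithm proposed in the paper.

```python
import random

K = 5  # number of possible labels

def bandit_feedback_round(x, true_label, predict):
    """One round of multiclass learning with bandit feedback: the learner predicts a
    label and observes only whether the prediction was correct, never the true label."""
    y_hat = predict(x)               # prediction in {0, ..., K-1}
    correct = (y_hat == true_label)  # the only feedback revealed to the learner
    return y_hat, correct

# Illustrative use with a purely random predictor (placeholder learner).
random_predict = lambda x: random.randrange(K)
y_hat, correct = bandit_feedback_round(x=0.7, true_label=3, predict=random_predict)
```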
Author:
Tsuchiya, Taira, Ito, Shinji
In this paper, we explore online convex optimization (OCO) and introduce a new analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of loss functions is…
External link:
http://arxiv.org/abs/2402.12868
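For context, the regret notion that "fast rates" refers to here is the standard OCO regret; the definition below is generic and not quoted from the paper.

$$R_T = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x),$$

where $x_t \in \mathcal{K}$ is the decision played at round $t$ and $f_t$ the loss revealed afterwards; a fast rate means regret growing slower than the worst-case $O(\sqrt{T})$, obtained in this line of work by exploiting the curvature of the feasible set $\mathcal{K}$ itself.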
Author:
Bonalli, Riccardo, Rudi, Alessandro
We propose a novel non-parametric learning paradigm for the identification of drift and diffusion coefficients of multi-dimensional non-linear stochastic differential equations, which relies upon discrete-time observations of the state. The key idea…
External link:
http://arxiv.org/abs/2305.15557
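A hedged sketch of the objects being identified and the discrete-time relations that estimators of this kind typically exploit; whether the paper uses these moment identities directly is not stated in the snippet.

$$\mathrm{d}X_t = b(X_t)\,\mathrm{d}t + \sigma(X_t)\,\mathrm{d}W_t,$$

with drift $b : \mathbb{R}^d \to \mathbb{R}^d$ and diffusion $\sigma : \mathbb{R}^d \to \mathbb{R}^{d \times d}$. For observations spaced by a small step $\Delta$,

$$\mathbb{E}\bigl[X_{t+\Delta} - X_t \mid X_t = x\bigr] \approx b(x)\,\Delta, \qquad \mathrm{Cov}\bigl[X_{t+\Delta} - X_t \mid X_t = x\bigr] \approx \sigma(x)\sigma(x)^{\top}\,\Delta,$$

which is the standard link between the unknown coefficients and discrete-time data.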
Bernstein's condition is a key assumption that guarantees fast rates in machine learning. For example, the Gibbs algorithm with prior $\pi$ has an excess risk in $O(d_{\pi}/n)$, as opposed to the standard $O(\sqrt{d_{\pi}/n})$, where $n$ denotes the sample size…
External link:
http://arxiv.org/abs/2302.11709
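For reference, the usual statement of the assumption the snippet invokes, written in a generic form (constants not taken from the paper):

$$\mathbb{E}\bigl[(\ell_f(Z) - \ell_{f^\ast}(Z))^2\bigr] \le B\,\mathbb{E}\bigl[\ell_f(Z) - \ell_{f^\ast}(Z)\bigr] \quad \text{for all } f \text{ in the class,}$$

where $f^\ast$ is the risk minimizer and $B > 0$ a constant: the variance of the excess loss is controlled by its mean. It is under this kind of condition that the Gibbs algorithm's excess risk improves from $O(\sqrt{d_{\pi}/n})$ to $O(d_{\pi}/n)$.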
Author:
Tiapkin, Daniil, Belomestny, Denis, Calandriello, Daniele, Moulines, Eric, Munos, Remi, Naumov, Alexey, Perrault, Pierre, Tang, Yunhao, Valko, Michal, Menard, Pierre
We address the challenge of exploration in reinforcement learning (RL) when the agent operates in an unknown environment with sparse or no rewards. In this work, we study the maximum entropy exploration problem of two different types. The first type…
External link:
http://arxiv.org/abs/2303.08059
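One common formalization of maximum entropy exploration, given only to anchor the snippet; the two problem types studied in the paper are not spelled out here, so the objective below is illustrative rather than the paper's exact one.

$$\max_{\pi}\; \mathcal{H}\bigl(d^{\pi}\bigr) = -\sum_{s} d^{\pi}(s)\,\log d^{\pi}(s),$$

where $d^{\pi}$ is the state-visitation distribution induced by policy $\pi$: instead of maximizing reward, the agent seeks a policy whose visits spread as evenly as possible over the state space.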
Academic article
This result is only available to logged-in users.
Simple regret is a natural and parameter-free performance criterion for pure exploration in multi-armed bandits, yet it is less popular than the probability of missing the best arm or an $\epsilon$-good arm, perhaps due to the lack of easy ways to characterize…
External link:
http://arxiv.org/abs/2210.16913
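For reference, the standard definition of simple regret being referred to (notation assumed, not quoted from the paper): after a pure-exploration phase of $T$ pulls the algorithm returns an arm $\hat{a}_T$, and its simple regret is

$$r_T = \mu^{\ast} - \mathbb{E}\bigl[\mu_{\hat{a}_T}\bigr],$$

the expected gap between the best arm's mean $\mu^{\ast}$ and the mean of the returned arm. It is parameter-free in the sense that, unlike the probability of missing an $\epsilon$-good arm, it requires no tolerance $\epsilon$ or confidence level to be fixed in advance.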
Author:
Zanger, Daniel Z.
Published in:
Journal of Complexity, vol. 80, February 2024