Search results

Report

Fast Rates for Bandit PAC Multiclass Classification

Authors: Erez, Liad; Cohen, Alon; Koren, Tomer; Mansour, Yishay; Moran, Shay

We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct. Our main contribution is in designing a novel learning algo

External link: http://arxiv.org/abs/2406.12406

Report

The Real Price of Bandit Information in Multiclass Classification

Authors: Erez, Liad; Cohen, Alon; Koren, Tomer; Mansour, Yishay; Moran, Shay

We revisit the classical problem of multiclass classification with bandit feedback (Kakade, Shalev-Shwartz and Tewari, 2008), where each input classifies to one of $K$ possible labels and feedback is restricted to whether the predicted label is corre

External link: http://arxiv.org/abs/2405.10027

Report

Locally Optimal Descent for Dynamic Stepsize Scheduling

Authors: Yehudai, Gilad; Cohen, Alon; Daniely, Amit; Drori, Yoel; Koren, Tomer; Schain, Mariano

We introduce a novel dynamic learning-rate scheduling scheme grounded in theory with the goal of simplifying the manual and time-consuming tuning of schedules in practice. Our approach is based on estimating the locally-optimal stepsize, guaranteeing

External link: http://arxiv.org/abs/2311.13877

Report

Rate-Optimal Policy Optimization for Linear Markov Decision Processes

Authors: Sherman, Uri; Cohen, Alon; Koren, Tomer; Mansour, Yishay

We study regret minimization in online episodic linear Markov Decision Processes, and obtain rate-optimal $\widetilde{O}(\sqrt{K})$ regret where $K$ denotes the number of episodes. Our work is the first to establish the optimal (w.r.t. $K$) rate of co

External link: http://arxiv.org/abs/2308.14642

Report

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT

Authors: Galler, Hadar Schreiber; Zahavy, Tom; Desjardins, Guillaume; Cohen, Alon

We study diverse skill discovery in reward-free environments, aiming to discover all possible skills in simple grid-world environments where prior methods have struggled to succeed. This problem is formulated as mutual training of skills using an int

External link: http://arxiv.org/abs/2308.12649

Report

Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

Authors: Levy, Orin; Cohen, Alon; Cassel, Asaf; Mansour, Yishay

We present the OMG-CMDP! algorithm for regret minimization in adversarial Contextual MDPs. The algorithm operates under the minimal assumptions of realizable function class and access to online least squares and log loss regression oracles. Our algor

External link: http://arxiv.org/abs/2303.01464

Report

Eluder-based Regret for Stochastic Contextual MDPs

Authors: Levy, Orin; Cassel, Asaf; Cohen, Alon; Mansour, Yishay

We present the E-UC$^3$RL algorithm for regret minimization in Stochastic Contextual Markov Decision Processes (CMDPs). The algorithm operates under the minimal assumptions of realizable function class and access to offline least squares and l

External link: http://arxiv.org/abs/2211.14932

Report

Rate-Optimal Online Convex Optimization in Adaptive Linear Control

Authors: Cassel, Asaf; Cohen, Alon; Koren, Tomer

We consider the problem of controlling an unknown linear dynamical system under adversarially changing convex costs and full feedback of both the state and cost function. We present the first computationally-efficient algorithm that attains an optima

External link: http://arxiv.org/abs/2206.01426

Report

Efficient Online Linear Control with Stochastic Convex Costs and Unknown Dynamics

Authors: Cassel, Asaf; Cohen, Alon; Koren, Tomer

We consider the problem of controlling an unknown linear dynamical system under a stochastic convex cost and full feedback of both the state and cost function. We present a computationally efficient algorithm that attains an optimal $\sqrt{T}$ regret

External link: http://arxiv.org/abs/2203.01170

Report

Asynchronous Stochastic Optimization Robust to Arbitrary Delays

Authors: Cohen, Alon; Daniely, Amit; Drori, Yoel; Koren, Tomer; Schain, Mariano

We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale stochastic gradient from step $t - d_t$ for some arbitrary delay $d_t$. This setting abstracts asynchronous distribut

External link: http://arxiv.org/abs/2106.11879
