Showing 1 - 10 of 742 for search: '"Mhammedi, A."'
We consider regret minimization in low-rank MDPs with fixed transition and adversarial losses. Previous work has investigated this problem under either full-information loss feedback with unknown transitions (Zhao et al., 2024), or bandit loss feedba
External link:
http://arxiv.org/abs/2411.06739
Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations, but the underlying ("latent") dynamics are comparatively simple. However, outside of restrictive settings s
External link:
http://arxiv.org/abs/2410.17904
Author:
Mhammedi, Zakaria
In this paper, we introduce a new projection-free algorithm for Online Convex Optimization (OCO) with a state-of-the-art regret guarantee among separation-based algorithms. Existing projection-free methods based on the classical Frank-Wolfe algorithm
External link:
http://arxiv.org/abs/2410.02476
Author:
Pfrommer, Daniel, Padmanabhan, Swati, Ahn, Kwangjun, Umenberger, Jack, Marcucci, Tobia, Mhammedi, Zakaria, Jadbabaie, Ali
Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for
External link:
http://arxiv.org/abs/2410.00859
Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions
Author:
Mhammedi, Zakaria
Designing sample-efficient and computationally feasible reinforcement learning (RL) algorithms is particularly challenging in environments with large or infinite state and action spaces. In this paper, we advance this effort by presenting an efficien
External link:
http://arxiv.org/abs/2409.04840
Published in:
Lecture Notes in Networks and Systems, Volume 635 LNNS, Pages 313 - 318, 2023
Recommender systems are a kind of data filtering that guides the user to interesting and valuable resources within an extensive dataset by providing suggestions of products that are expected to match their preferences. However, due to data overloadi
External link:
http://arxiv.org/abs/2406.10235
Author:
Cutkosky, Ashok, Mhammedi, Zakaria
We provide an online learning algorithm that obtains regret $G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})} + \|w_\star\|^2 + G^2$ on $G$-Lipschitz convex losses for any comparison point $w_\star$ without knowing either $G$ or $\|w_\star\|$. Importan
External link:
http://arxiv.org/abs/2405.20540
Author:
Massari, Hakim El, Gherabi, Noreddine, Mhammedi, Sajida, Ghandi, Hamza, Bahaj, Mohamed, Naqvi, Muhammad Raza
Published in:
International journal of online and biomedical engineering, Volume 18, Issue 11, 2022, Pages 143 - 157
Cardiovascular disease is one of the chronic diseases that is on the rise. The complications occur when cardiovascular disease is not discovered early and correctly diagnosed at the right time. Various machine learning approaches, including ontology-
External link:
http://arxiv.org/abs/2405.20414
Simulators are a pervasive tool in reinforcement learning, but most existing algorithms cannot efficiently exploit simulator access -- particularly in high-dimensional domains that require general function approximation. We explore the power of simul
External link:
http://arxiv.org/abs/2404.15417
A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required. Low-Rank Markov Decision Processes -- where tran
External link:
http://arxiv.org/abs/2307.03997