Showing 1 - 10 of 2,614 results for search query: '"Online reinforcement learning"'
Offline-to-online reinforcement learning (RL) leverages both pre-trained offline policies and online policies trained for downstream tasks, aiming to improve data efficiency and accelerate performance enhancement. An existing approach, Policy Expansi…
External link:
http://arxiv.org/abs/2410.23737
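Several results in this listing concern the offline-to-online (O2O) paradigm: pretrain a policy from a static dataset of logged transitions, then fine-tune it through live environment interaction. As a minimal, hypothetical sketch of that two-phase recipe (a toy chain MDP with tabular Q-learning, illustrative only and not the method of any paper listed here):

```python
import random

# Toy chain MDP: 5 states, reward 1 only upon reaching the last state.
N_STATES, ACTIONS = 5, (0, 1)   # actions: 0 = left, 1 = right
GAMMA, ALPHA = 0.9, 0.5

def step(s, a):
    """Deterministic transition; returns (next_state, reward)."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, float(s2 == N_STATES - 1)

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def td_update(s, a, r, s2):
    """One temporal-difference backup toward the Bellman target."""
    target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

# Phase 1: offline pretraining from a fixed dataset of logged transitions.
offline_data = [(s, a) + step(s, a) for s in range(N_STATES) for a in ACTIONS]
for _ in range(200):
    for s, a, s2, r in offline_data:
        td_update(s, a, r, s2)

# Phase 2: online fine-tuning with epsilon-greedy environment interaction.
random.seed(0)
for _ in range(50):
    s = 0
    for _ in range(10):
        if random.random() < 0.1:
            a = random.choice(ACTIONS)          # explore
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])  # exploit
        s2, r = step(s, a)
        td_update(s, a, r, s2)
        s = s2

# Greedy policy after fine-tuning: move right toward the rewarding state.
policy = [max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(N_STATES)]
print(policy)
```

The two phases share one value table; real O2O methods add machinery (conservatism during pretraining, careful initialization for fine-tuning) that this sketch deliberately omits.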
Authors:
Schiffer, Benjamin, Janson, Lucas
Many practical applications of online reinforcement learning require the satisfaction of safety constraints while learning about the unknown environment. In this work, we study Linear Quadratic Regulator (LQR) learning with unknown dynamics, but with…
External link:
http://arxiv.org/abs/2410.21081
The offline-to-online (O2O) paradigm in reinforcement learning (RL) utilizes pre-trained models on offline datasets for subsequent online fine-tuning. However, conventional O2O RL algorithms typically require maintaining and retraining the large offl…
External link:
http://arxiv.org/abs/2410.18626
Offline-to-online reinforcement learning (O2O RL) aims to obtain a continually improving policy as it interacts with the environment, while ensuring the initial behaviour is satisficing. This satisficing behaviour is necessary for robotic manipulatio…
External link:
http://arxiv.org/abs/2410.14957
Authors:
Pattanaik, Anay, Varshney, Lav R.
This paper considers an online reinforcement learning algorithm that leverages pre-collected data (passive memory) from the environment for online interaction. We show that using passive memory improves performance and further provide theoretical gua…
External link:
http://arxiv.org/abs/2410.14665
The modern paradigm in machine learning involves pre-training on diverse data, followed by task-specific fine-tuning. In reinforcement learning (RL), this translates to learning via offline RL on a diverse historical dataset, followed by rapid online…
External link:
http://arxiv.org/abs/2412.07762
The high costs and risks involved in extensive environment interactions hinder the practical application of current online safe reinforcement learning (RL) methods. While offline safe RL addresses this by learning policies from static datasets, the p…
External link:
http://arxiv.org/abs/2412.04426
Reinforcement learning algorithms are usually stated without theoretical guarantees regarding their performance. Recently, Jin, Yang, Wang, and Jordan (COLT 2020) showed a polynomial-time reinforcement learning algorithm (namely, LSVI-UCB) for the se…
External link:
http://arxiv.org/abs/2411.10906
Authors:
Trella, Anna L., Zhang, Kelly W., Jajal, Hinal, Nahum-Shani, Inbal, Shetty, Vivek, Doshi-Velez, Finale, Murphy, Susan A.
Dental disease is a prevalent chronic condition associated with substantial financial burden, personal suffering, and increased risk of systemic diseases. Despite widespread recommendations for twice-daily tooth brushing, adherence to recommended ora…
External link:
http://arxiv.org/abs/2409.02069
Offline-to-online reinforcement learning (RL), a framework that trains a policy with offline RL and then further fine-tunes it with online RL, has been considered a promising recipe for data-driven decision-making. While sensible, this framework has…
External link:
http://arxiv.org/abs/2408.14785