Showing 1 - 10 of 37 results for search: '"Curi, Sebastian"'
Author: Cideron, Geoffrey, Tabanpour, Baruch, Curi, Sebastian, Girgin, Sertan, Hussenot, Leonard, Dulac-Arnold, Gabriel, Geist, Matthieu, Pietquin, Olivier, Dadashi, Robert
We consider the Imitation Learning (IL) setup where expert data are not collected on the actual deployment environment but on a different version. To address the resulting distribution shift, we combine behavior cloning (BC) with a planner that is tasked…
External link: http://arxiv.org/abs/2305.01400
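As a rough illustration of the idea described in this entry (a generic sketch under assumed interfaces, not the paper's algorithm; `bc_policy`, `planner`, and `expert_density` are hypothetical callables):

```python
# Generic sketch: imitate with the behavior-cloned policy while the
# state looks like the expert data; once the state drifts out of the
# expert distribution, hand control to a planner that steers back
# toward expert-visited states. All names are hypothetical.
def act(state, bc_policy, planner, expert_density, threshold=1e-3):
    if expert_density(state) >= threshold:
        # In-distribution: behavior cloning is reliable here.
        return bc_policy(state)
    # Out-of-distribution: plan a return to high-density expert states.
    return planner(state, target_density=expert_density)
```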
Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via…
External link: http://arxiv.org/abs/2207.01337
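To make the safety-filter idea concrete, a minimal sketch of one generic confidence-based scheme (illustrative only; the ensemble interface, `is_safe`, and `backup_controller` are assumptions, not the paper's API):

```python
import numpy as np

def filter_action(state, nominal_action, dynamics_ensemble,
                  is_safe, backup_controller, confidence=0.95):
    # Predict the next state with each member of a learned model
    # ensemble; disagreement reflects epistemic uncertainty.
    next_states = [model(state, nominal_action) for model in dynamics_ensemble]
    # Fraction of models predicting that the nominal action keeps the
    # system inside the state safety constraints.
    safe_fraction = np.mean([is_safe(s) for s in next_states])
    if safe_fraction >= confidence:
        return nominal_action        # confident enough: pass through
    return backup_controller(state)  # otherwise fall back to a safe action
```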
Author: Sukhija, Bhavya, Köhler, Nathanael, Zamora, Miguel, Zimmermann, Simon, Curi, Sebastian, Krause, Andreas, Coros, Stelian
Trajectory optimization methods have achieved an exceptional level of performance on real-world robots in recent years. These methods heavily rely on accurate analytical models of the dynamics, yet some aspects of the physical world can only be captured…
External link: http://arxiv.org/abs/2204.04558
Improving sample-efficiency and safety are crucial challenges when deploying reinforcement learning in high-stakes real world applications. We propose LAMBDA, a novel model-based approach for policy optimization in safety critical tasks modeled via constrained Markov decision processes…
External link: http://arxiv.org/abs/2201.09802
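For orientation, a constrained Markov decision process (the model class this entry names) poses safe policy optimization in the standard form, with reward $r$, cost $c$, discount $\gamma$, and safety budget $d$:

$$\max_{\pi}\ \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty}\gamma^{t}\, r(s_t,a_t)\Big]\quad\text{s.t.}\quad \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty}\gamma^{t}\, c(s_t,a_t)\Big]\le d$$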
In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that are not present during training time. To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations. The robust RL framework…
External link: http://arxiv.org/abs/2103.10369
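For reference, the worst-case objective of the robust RL framework mentioned above is commonly written as a max-min over an uncertainty set $\Xi$ of environment perturbations (standard formulation, not necessarily this paper's exact notation):

$$\max_{\pi}\ \min_{\xi\in\Xi}\ \mathbb{E}_{\pi,\xi}\Big[\sum_{t=0}^{\infty}\gamma^{t}\, r(s_t,a_t)\Big]$$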
Training Reinforcement Learning (RL) agents in high-stakes applications may be prohibitive due to the risk associated with exploration. Thus, the agent can only use data previously collected by safe policies. While previous work considers optimizing…
External link: http://arxiv.org/abs/2102.05371
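A standard risk-averse criterion in this setting (the snippet does not show which one the paper adopts) replaces the expected return with its conditional value-at-risk, the mean of the worst $\alpha$-fraction of returns:

$$\mathrm{CVaR}_{\alpha}(R)=\mathbb{E}\big[R\,\big|\,R\le \mathrm{VaR}_{\alpha}(R)\big]$$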
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs. The method is closely related to the classic Relative Entropy Policy Search (REPS) algorithm of Peters et al. (2010)…
External link: http://arxiv.org/abs/2010.11151
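For context, the classic linear-programming formulation of optimal control that such methods regularize optimizes over discounted state-action occupancy measures $\mu$ (general form; the paper's exact regularizer is not shown in the snippet):

$$\max_{\mu\ge 0}\ \sum_{s,a}\mu(s,a)\,r(s,a)\quad\text{s.t.}\quad \sum_{a}\mu(s,a)=(1-\gamma)\,\nu_{0}(s)+\gamma\sum_{s',a'}P(s\mid s',a')\,\mu(s',a')\ \ \forall s$$

A regularized variant adds, e.g., a relative-entropy penalty $-\tfrac{1}{\eta}D_{\mathrm{KL}}(\mu\,\|\,\mu_{0})$ to the objective, which is what connects it to REPS-style updates.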
The principal task in controlling dynamical systems is to ensure their stability. When the system is unknown, robust approaches are promising since they aim to stabilize a large set of plausible systems simultaneously. We study linear controllers under quadratic costs…
External link: http://arxiv.org/abs/2006.11022
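For reference, the linear-quadratic setting referred to above (standard formulation): linear dynamics with a cost quadratic in state and action, minimized over static feedback gains $K$:

$$s_{t+1}=A s_t+B a_t+w_t,\qquad a_t=K s_t,\qquad J(K)=\lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}\Big[\sum_{t=0}^{T-1}\big(s_t^{\top}Q\, s_t+a_t^{\top}R\, a_t\big)\Big]$$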
Model-based reinforcement learning algorithms with probabilistic dynamical models are amongst the most data-efficient learning methods. This is often attributed to their ability to distinguish between epistemic and aleatoric uncertainty. However, while…
External link: http://arxiv.org/abs/2006.08684
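The epistemic/aleatoric split mentioned above is often obtained from a probabilistic ensemble via the law of total variance; a minimal sketch (illustrative, not this paper's algorithm):

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Each of N ensemble members predicts a Gaussian over the next
    state: means[i], variances[i] (arrays of shape [N, state_dim])."""
    aleatoric = np.mean(variances, axis=0)  # average predicted noise
    epistemic = np.var(means, axis=0)       # disagreement across members
    return epistemic, aleatoric, epistemic + aleatoric  # total variance
```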
In high-stakes machine learning applications, it is crucial to perform well not only on average, but also on difficult examples. To address this, we consider the problem of training models in a risk-averse manner. We propose an adaptive sampling…
External link: http://arxiv.org/abs/1910.12511
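A common way to make such risk-averse training tractable (the snippet does not confirm this is the paper's exact objective) is to minimize the conditional value-at-risk of the per-example loss $\ell(\theta;x)$ via the Rockafellar-Uryasev reformulation:

$$\min_{\theta}\ \mathrm{CVaR}_{\alpha}\big(\ell(\theta;x)\big)=\min_{\theta,\,\tau}\ \Big\{\tau+\frac{1}{\alpha}\,\mathbb{E}_{x}\big[(\ell(\theta;x)-\tau)_{+}\big]\Big\},$$

which concentrates the training signal on the hardest $\alpha$-fraction of examples.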