Showing 1 - 10 of 535 results for the search: '"Perez, Mateo"'
Author:
Kazemi, Milad, Perez, Mateo, Somenzi, Fabio, Soudjani, Sadegh, Trivedi, Ashutosh, Velasquez, Alvaro
We present a modular approach to reinforcement learning (RL) in environments consisting of simpler components evolving in parallel. A monolithic view of such modular environments may be prohibitively large to learn, or may require unrealizable … (see the sketch below)
External link:
http://arxiv.org/abs/2312.09938
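The snippet above is truncated. As a loose, hypothetical illustration of why the monolithic view can be prohibitively large (the component names and dynamics below are invented, not taken from the paper), composing even two tiny parallel components multiplies the state space:

```python
# Hypothetical illustration (not the paper's algorithm): two components
# evolving in parallel, and the product state space a monolithic learner
# would have to cope with.
from itertools import product

# Each component is a tiny deterministic transition system.
states_a = ["a0", "a1"]
states_b = ["b0", "b1", "b2"]

def step_a(state: str, action: str) -> str:
    return "a1" if action == "go" else "a0"

def step_b(state: str, action: str) -> str:
    return states_b[(states_b.index(state) + 1) % 3] if action == "go" else state

# The monolithic view is the product: its size is the product of the
# component sizes, so it grows exponentially in the number of components.
joint_states = list(product(states_a, states_b))
print(len(joint_states))  # 2 * 3 = 6 joint states

def joint_step(state: tuple, action: str) -> tuple:
    sa, sb = state
    return step_a(sa, action), step_b(sb, action)

print(joint_step(("a0", "b0"), "go"))  # ('a1', 'b1')
```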
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct representation of non-Markovian … (see the sketch below)
External link:
http://arxiv.org/abs/2312.08602
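As a rough sketch of the "lookback" idea from the snippet above (the DFA and reward below are invented for illustration and are not the paper's construction), a regular property of the past can be tracked with a small automaton instead of the full history:

```python
# Hypothetical lookback: the reward is guarded by the regular property
# "the last two actions were both 'a'", tracked by a three-state DFA.
DFA = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q2",  # q2 is accepting: two consecutive 'a's observed
    ("q1", "b"): "q0",
    ("q2", "a"): "q2",
    ("q2", "b"): "q0",
}

def reward(dfa_state: str) -> float:
    # Reward fires exactly when the regular property of the past holds.
    return 1.0 if dfa_state == "q2" else 0.0

q = "q0"
total = 0.0
for action in ["a", "b", "a", "a", "a"]:
    q = DFA[(q, action)]
    total += reward(q)
print(total)  # 2.0: reward fires once "aa" has been seen, and stays while 'a' repeats
```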
Linear temporal logic (LTL) and omega-regular objectives -- a superset of LTL -- have seen recent use as a way to express non-Markovian objectives in reinforcement learning. We introduce a model-based probably approximately correct (PAC) learning algorithm … (see the sketch below)
External link:
http://arxiv.org/abs/2310.12248
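The snippet above is cut off. For flavor only, here is the generic Hoeffding-style sample bound that model-based PAC arguments typically build on; this is a standard textbook bound, not the bound proved in the paper:

```python
# Generic Hoeffding bound (not the paper's): to estimate one transition
# probability within epsilon of its true value with confidence 1 - delta,
# n >= ln(2/delta) / (2 * epsilon**2) samples suffice.
import math

def hoeffding_samples(epsilon: float, delta: float) -> int:
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

print(hoeffding_samples(0.05, 0.01))  # 1060 samples for epsilon=0.05, delta=0.01
```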
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Reinforcement learning (RL) is a powerful approach for training agents to perform tasks, but designing an appropriate reward mechanism is critical to its success. However, in many cases, the complexity of the learning objectives goes beyond the capabilities …
External link:
http://arxiv.org/abs/2308.07469
Author:
Alur, Rajeev, Bastani, Osbert, Jothimurugan, Kishor, Perez, Mateo, Somenzi, Fabio, Trivedi, Ashutosh
The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL has the downside that it is sensitive to small perturbations in the …
External link:
http://arxiv.org/abs/2305.17115
Author:
Lavaei, Abolfazl, Perez, Mateo, Kazemi, Milad, Somenzi, Fabio, Soudjani, Sadegh, Trivedi, Ashutosh, Zamani, Majid
We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each subsystem …
External link:
http://arxiv.org/abs/2208.03485
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Recursion is the fundamental paradigm to finitely describe potentially infinite objects. As state-of-the-art reinforcement learning (RL) algorithms cannot directly reason about recursion, they must rely on the practitioner's ingenuity in designing a …
External link:
http://arxiv.org/abs/2206.11430
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these … (see the sketch below)
External link:
http://arxiv.org/abs/2205.03243
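As a hypothetical aside on the Rabin condition mentioned above (the pair below is invented; this is not the paper's translation to scalar values): a Rabin pair (E, F) accepts a run iff states in E recur only finitely often while some state in F recurs infinitely often. For a lasso-shaped run, that reduces to inspecting the states on the cycle:

```python
# Illustrative only: Rabin acceptance for a lasso run, where the set of
# infinitely recurring states is exactly the set of states on the cycle.
# A pair (E, F) accepts iff the cycle avoids E entirely and touches F.
def rabin_accepts(cycle_states, pairs):
    return any(not (cycle_states & E) and bool(cycle_states & F) for E, F in pairs)

pairs = [({"bad"}, {"good"})]                 # one hypothetical Rabin pair
print(rabin_accepts({"good", "ok"}, pairs))   # True: 'good' recurs, 'bad' does not
print(rabin_accepts({"good", "bad"}, pairs))  # False: 'bad' also recurs
```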
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Reinforcement learning synthesizes controllers without prior knowledge of the system. At each timestep, a reward is given. The controllers optimize the discounted sum of these rewards. Applying this class of algorithms requires designing a reward scheme … (see the example below)
External link:
http://arxiv.org/abs/2106.09161
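Since the entry above mentions the discounted-sum objective, here is that quantity spelled out; the reward values and discount factor are made up for the example:

```python
# The discounted return sum_t gamma^t * r_t that the controllers optimize.
def discounted_return(rewards, gamma=0.99):
    return sum(gamma ** t * r for t, r in enumerate(rewards))

print(discounted_return([0.0, 0.0, 1.0], gamma=0.9))  # 0.9**2 * 1.0 ≈ 0.81
```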
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMC is a collection of entities of various types … (see the sketch below)
External link:
http://arxiv.org/abs/2106.06777
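A minimal sketch of one step of a multitype branching process, to make the "collection of entities of various types" concrete; the types and offspring distributions below are invented for illustration and are not taken from the paper:

```python
# Hypothetical branching step: every entity independently produces a multiset
# of offspring according to its type's distribution, so the state is a
# collection of typed entities whose size can grow or shrink.
import random

random.seed(0)
OFFSPRING = {  # type -> list of (probability, offspring multiset)
    "A": [(0.5, ["A", "B"]), (0.5, [])],  # an A splits or dies, equiprobably
    "B": [(1.0, ["B"])],                  # a B persists unchanged
}

def step(population):
    nxt = []
    for entity in population:
        r, acc = random.random(), 0.0
        for p, children in OFFSPRING[entity]:
            acc += p
            if r <= acc:
                nxt.extend(children)
                break
    return nxt

pop = ["A", "A"]
for _ in range(3):
    pop = step(pop)
print(pop)  # the surviving collection of typed entities after three steps
```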