Showing 1 - 10 of 535 results for the search: '"Perez, Mateo"'
Author:
Kazemi, Milad, Perez, Mateo, Somenzi, Fabio, Soudjani, Sadegh, Trivedi, Ashutosh, Velasquez, Alvaro
We present a modular approach to reinforcement learning (RL) in environments consisting of simpler components evolving in parallel. A monolithic view of such modular environments may be prohibitively large to learn, or may require unrealizable … (see the sketch below)
External link:
http://arxiv.org/abs/2312.09938
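The snippet above is truncated. As a loose, hypothetical illustration of why the monolithic view can be prohibitively large (the component names and dynamics below are invented, not taken from the paper), composing even two tiny parallel components multiplies the state space:

```python
# Hypothetical illustration (not the paper's algorithm): two components
# evolving in parallel, and the product state space a monolithic learner
# would have to cope with.
from itertools import product

# Each component is a tiny deterministic transition system.
states_a = ["a0", "a1"]
states_b = ["b0", "b1", "b2"]

def step_a(state: str, action: str) -> str:
    return "a1" if action == "go" else "a0"

def step_b(state: str, action: str) -> str:
    return states_b[(states_b.index(state) + 1) % 3] if action == "go" else state

# The monolithic view is the product: its size is the product of the
# component sizes, so it grows exponentially in the number of components.
joint_states = list(product(states_a, states_b))
print(len(joint_states))  # 2 * 3 = 6 joint states

def joint_step(state: tuple, action: str) -> tuple:
    sa, sb = state
    return step_a(sa, action), step_b(sb, action)

print(joint_step(("a0", "b0"), "go"))  # ('a1', 'b1')
```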
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct representation of non-Markovian … (see the sketch below)
External link:
http://arxiv.org/abs/2312.08602
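As a rough sketch of the "lookback" idea from the snippet above (the DFA and reward below are invented for illustration and are not the paper's construction), a regular property of the past can be tracked with a small automaton instead of the full history:

```python
# Hypothetical lookback: the reward is guarded by the regular property
# "the last two actions were both 'a'", tracked by a three-state DFA.
DFA = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q2",  # q2 is accepting: two consecutive 'a's observed
    ("q1", "b"): "q0",
    ("q2", "a"): "q2",
    ("q2", "b"): "q0",
}

def reward(dfa_state: str) -> float:
    # Reward fires exactly when the regular property of the past holds.
    return 1.0 if dfa_state == "q2" else 0.0

q = "q0"
total = 0.0
for action in ["a", "b", "a", "a", "a"]:
    q = DFA[(q, action)]
    total += reward(q)
print(total)  # 2.0: reward fires once "aa" has been seen, and stays while 'a' repeats
```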
Linear temporal logic (LTL) and omega-regular objectives -- a superset of LTL -- have seen recent use as a way to express non-Markovian objectives in reinforcement learning. We introduce a model-based probably approximately correct (PAC) learning algorithm … (see the sketch below)
External link:
http://arxiv.org/abs/2310.12248
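The snippet above is cut off. For flavor only, here is the generic Hoeffding-style sample bound that model-based PAC arguments typically build on; this is a standard textbook bound, not the bound proved in the paper:

```python
# Generic Hoeffding bound (not the paper's): to estimate one transition
# probability within epsilon of its true value with confidence 1 - delta,
# n >= ln(2/delta) / (2 * epsilon**2) samples suffice.
import math

def hoeffding_samples(epsilon: float, delta: float) -> int:
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

print(hoeffding_samples(0.05, 0.01))  # 1060 samples for epsilon=0.05, delta=0.01
```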
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Reinforcement learning (RL) is a powerful approach for training agents to perform tasks, but designing an appropriate reward mechanism is critical to its success. However, in many cases, the complexity of the learning objectives goes beyond the capabilities …
External link:
http://arxiv.org/abs/2308.07469
Author:
Alur, Rajeev, Bastani, Osbert, Jothimurugan, Kishor, Perez, Mateo, Somenzi, Fabio, Trivedi, Ashutosh
The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL has the downside that it is sensitive to small perturbations in the …
External link:
http://arxiv.org/abs/2305.17115
Author:
Lavaei, Abolfazl, Perez, Mateo, Kazemi, Milad, Somenzi, Fabio, Soudjani, Sadegh, Trivedi, Ashutosh, Zamani, Majid
We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each subsystem …
External link:
http://arxiv.org/abs/2208.03485
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Recursion is the fundamental paradigm to finitely describe potentially infinite objects. As state-of-the-art reinforcement learning (RL) algorithms cannot directly reason about recursion, they must rely on the practitioner's ingenuity in designing a …
External link:
http://arxiv.org/abs/2206.11430
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these … (see the sketch below)
External link:
http://arxiv.org/abs/2205.03243
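As a hypothetical aside on the Rabin condition mentioned above (the pair below is invented; this is not the paper's translation to scalar values): a Rabin pair (E, F) accepts a run iff states in E recur only finitely often while some state in F recurs infinitely often. For a lasso-shaped run, that reduces to inspecting the states on the cycle:

```python
# Illustrative only: Rabin acceptance for a lasso run, where the set of
# infinitely recurring states is exactly the set of states on the cycle.
# A pair (E, F) accepts iff the cycle avoids E entirely and touches F.
def rabin_accepts(cycle_states, pairs):
    return any(not (cycle_states & E) and bool(cycle_states & F) for E, F in pairs)

pairs = [({"bad"}, {"good"})]                 # one hypothetical Rabin pair
print(rabin_accepts({"good", "ok"}, pairs))   # True: 'good' recurs, 'bad' does not
print(rabin_accepts({"good", "bad"}, pairs))  # False: 'bad' also recurs
```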
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
Reinforcement learning synthesizes controllers without prior knowledge of the system. At each timestep, a reward is given. The controllers optimize the discounted sum of these rewards. Applying this class of algorithms requires designing a reward scheme … (see the example below)
External link:
http://arxiv.org/abs/2106.09161
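Since the entry above mentions the discounted-sum objective, here is that quantity spelled out; the reward values and discount factor are made up for the example:

```python
# The discounted return sum_t gamma^t * r_t that the controllers optimize.
def discounted_return(rewards, gamma=0.99):
    return sum(gamma ** t * r for t, r in enumerate(rewards))

print(discounted_return([0.0, 0.0, 1.0], gamma=0.9))  # 0.9**2 * 1.0 ≈ 0.81
```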
Author:
Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik
We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMC is a collection of entities of various types … (see the sketch below)
External link:
http://arxiv.org/abs/2106.06777
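A minimal sketch of one step of a multitype branching process, to make the "collection of entities of various types" concrete; the types and offspring distributions below are invented for illustration and are not taken from the paper:

```python
# Hypothetical branching step: every entity independently produces a multiset
# of offspring according to its type's distribution, so the state is a
# collection of typed entities whose size can grow or shrink.
import random

random.seed(0)
OFFSPRING = {  # type -> list of (probability, offspring multiset)
    "A": [(0.5, ["A", "B"]), (0.5, [])],  # an A splits or dies, equiprobably
    "B": [(1.0, ["B"])],                  # a B persists unchanged
}

def step(population):
    nxt = []
    for entity in population:
        r, acc = random.random(), 0.0
        for p, children in OFFSPRING[entity]:
            acc += p
            if r <= acc:
                nxt.extend(children)
                break
    return nxt

pop = ["A", "A"]
for _ in range(3):
    pop = step(pop)
print(pop)  # the surviving collection of typed entities after three steps
```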