Showing 1 - 10 of 48 for search: '"Modi, Aditya"'
We present a novel method, Contextual goal-Oriented Data Augmentation (CODA), which uses commonly available unlabeled trajectories and context-goal pairs to solve Contextual Goal-Oriented (CGO) problems. By carefully constructing an action-augmented
External link:
http://arxiv.org/abs/2408.07753
We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions. On the positive side, we propose the RFOLIVE (Reward
External link:
http://arxiv.org/abs/2206.10770
Learning-based control of linear systems has received a lot of attention recently. In popular settings, the true dynamical models are unknown to the decision-maker and need to be interactively learned by applying control inputs to the systems. Unlike th
External link:
http://arxiv.org/abs/2201.01387
Linear time-invariant systems are very popular models in system theory and applications. A fundamental problem in system identification that remains rather unaddressed in extant literature is to leverage commonalities amongst related linear systems t
External link:
http://arxiv.org/abs/2112.10955
Published in:
Automatica, June 2024, 164
The low rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning. With a known representation, several model-free exploration strategies exist. In contrast, all algorithms for the unkno
External link:
http://arxiv.org/abs/2102.07035
Standard reinforcement learning (RL) aims to find an optimal policy that identifies the best action for each state. However, in healthcare settings, many actions may be near-equivalent with respect to the reward (e.g., survival). We consider an alter
External link:
http://arxiv.org/abs/2007.12678
Reinforcement learning (RL) methods have been shown to be capable of learning intelligent behavior in rich domains. However, this has largely been done in simulated domains without adequate focus on the process of building the simulator. In this pape
External link:
http://arxiv.org/abs/1910.10597
Author:
Modi, Aditya, Dey, Debadeepta, Agarwal, Alekh, Swaminathan, Adith, Nushi, Besmira, Andrist, Sean, Horvitz, Eric
Assemblies of modular subsystems are being pressed into service to perform sensing, reasoning, and decision making in high-stakes, time-critical tasks in such areas as transportation, healthcare, and industrial automation. We address the opportunity
External link:
http://arxiv.org/abs/1905.05179
Author:
Modi, Aditya, Tewari, Ambuj
We consider the recently proposed reinforcement learning (RL) framework of Contextual Markov Decision Processes (CMDP), where the agent interacts with a (potentially adversarial) sequence of episodic tabular MDPs. In addition, a context vector determ
External link:
http://arxiv.org/abs/1903.06187