Search results

Report

A Hypothesis on Black Swan in Unchanging Environments

Authors: Lee, Hyunin; Park, Chanwoo; Abel, David; Jin, Ming

Black swan events are statistically rare occurrences that carry extremely high risks. A typical view of defining black swan events is heavily assumed to originate from unpredictable time-varying environments; however, the community lacks a compreh

External link: http://arxiv.org/abs/2407.18422

Report

Pausing Policy Learning in Non-stationary Reinforcement Learning

Authors: Lee, Hyunin; Jin, Ming; Lavaei, Javad; Sojoudi, Somayeh

Real-time inference is a challenge of real-world reinforcement learning due to temporal differences in time-varying environments: the system collects data from the past, updates the decision model in the present, and deploys it in the future. We tack

External link: http://arxiv.org/abs/2405.16053

Report

Tempo Adaptation in Non-stationary Reinforcement Learning

Authors: Lee, Hyunin; Ding, Yuhao; Lee, Jongmin; Jin, Ming; Lavaei, Javad; Sojoudi, Somayeh

We first raise and tackle a "time synchronization" issue between the agent and the environment in non-stationary reinforcement learning (RL), a crucial factor hindering its real-world applications. In reality, environmental changes occur over wall-

External link: http://arxiv.org/abs/2309.14989

Report

Initial State Interventions for Deconfounded Imitation Learning

Authors: Pfrommer, Samuel; Bai, Yatong; Lee, Hyunin; Sojoudi, Somayeh

Imitation learning suffers from causal confusion. This phenomenon occurs when learned policies attend to features that do not causally influence the expert actions but are instead spuriously correlated. Causally confused agents produce low open-loop

External link: http://arxiv.org/abs/2307.15980

Report

Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction

Authors: Ying, Donghao; Guo, Mengzi Amy; Lee, Hyunin; Ding, Yuhao; Lavaei, Javad; Shen, Zuo-Jun Max

We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy Gradient Algori

External link: http://arxiv.org/abs/2205.10715

Report

Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization

Authors: Ding, Yuhao; Zhang, Junzi; Lee, Hyunin; Lavaei, Javad

Entropy regularization is an efficient technique for encouraging exploration and preventing premature convergence of (vanilla) policy gradient methods in reinforcement learning (RL). However, the theoretical understanding of entropy-regularized RL

External link: http://arxiv.org/abs/2110.10117
