Search results

FOSP: Fine-tuning Offline Safe Policy through World Models

Authors: Cao, Chenyang; Xin, Yucheng; Wu, Silang; He, Longxiang; Yan, Zichen; Tan, Junbo; Wang, Xueqian

Model-based Reinforcement Learning (RL) has shown its high training efficiency and capability of handling high-dimensional tasks. Regarding safety issues, safe model-based RL can achieve nearly zero-cost performance and effectively manage the trade-o

External link: http://arxiv.org/abs/2407.04942

AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization

Authors: He, Longxiang; Shen, Li; Tan, Junbo; Wang, Xueqian

Implicit Q-learning (IQL) serves as a strong baseline for offline RL, which learns the value function using only dataset actions through quantile regression. However, it is unclear how to recover the implicit policy from the learned implicit Q-functi

External link: http://arxiv.org/abs/2405.18187

Offline Goal-Conditioned Reinforcement Learning for Safety-Critical Tasks with Recovery Policy

Authors: Cao, Chenyang; Yan, Zichen; Lu, Renhao; Tan, Junbo; Wang, Xueqian

Offline goal-conditioned reinforcement learning (GCRL) aims at solving goal-reaching tasks with sparse rewards from an offline dataset. While prior work has demonstrated various approaches for agents to learn near-optimal policies, these methods enco

External link: http://arxiv.org/abs/2403.01734

Geometric Structure and Polynomial-time Algorithm of Game Equilibria

Authors: Sun, Hongbo; Xia, Chongkun; Tan, Junbo; Yuan, Bo; Wang, Xueqian; Liang, Bin

Whether a PTAS (polynomial-time approximation scheme) exists for game equilibria has been an open question, and its absence has indications and consequences in three fields: the practicality of methods in algorithmic game theory, non-stationarity and

External link: http://arxiv.org/abs/2401.00747

DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning

Authors: He, Longxiang; Shen, Li; Zhang, Linrui; Tan, Junbo; Wang, Xueqian

Constrained policy search (CPS) is a fundamental problem in offline reinforcement learning, which is generally solved by advantage weighted regression (AWR). However, previous methods may still encounter out-of-distribution actions due to the limited

External link: http://arxiv.org/abs/2310.05333

Hybrid Trajectory Optimization for Autonomous Terrain Traversal of Articulated Tracked Robots

Authors: Xu, Zhengzhe; Chen, Yanbo; Jian, Zhuozhu; Tan, Junbo; Wang, Xueqian; Liang, Bin

Autonomous terrain traversal of articulated tracked robots can reduce operator cognitive load to enhance task efficiency and facilitate extensive deployment. We present a novel hybrid trajectory optimization method aimed at generating efficient, stab

External link: http://arxiv.org/abs/2306.02659

Visuotactile Sensor Enabled Pneumatic Device Towards Compliant Oropharyngeal Swab Sampling

Authors: Li, Shoujie; He, Mingshan; Ding, Wenbo; Ye, Linqi; Wang, Xueqian; Tan, Junbo; Yuan, Jinqiu; Zhang, Xiao-Ping

Manual oropharyngeal (OP) swab sampling is an intensive and risky task. In this article, a novel OP swab sampling device of low cost and high compliance is designed by combining the visuo-tactile sensor and the pneumatic actuator-based gripper. Here,

External link: http://arxiv.org/abs/2305.06537

Data-Driven Robust Control for Discrete Linear Time-Invariant Systems: A Descriptor System Approach

Authors: He, Jiabao; Zhang, Xuan; Xu, Feng; Tan, Junbo; Wang, Xueqian

Given the recent surge of interest in data-driven control, this paper proposes a two-step method to study robust data-driven control for a parameter-unknown linear time-invariant (LTI) system that is affected by energy-bounded noises. First, two data

External link: http://arxiv.org/abs/2203.06959

Data-Driven Controllability Analysis and Stabilization for Linear Descriptor Systems

Authors: He, Jiabao; Zhang, Xuan; Xu, Feng; Tan, Junbo; Wang, Xueqian

For a parameter-unknown linear descriptor system, this paper proposes data-driven methods to testify the system's type and controllability and then to stabilize it. First, a data-based condition is developed to identify whether this unknown system is

External link: http://arxiv.org/abs/2112.03665
