Search results

Report

Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control

Authors: Chen, Huayu; Zheng, Kaiwen; Su, Hang; Zhu, Jun

Drawing upon recent advances in language model alignment, we formulate offline Reinforcement Learning as a two-stage optimization problem: First pretraining expressive generative policies on reward-free behavior datasets, then fine-tuning these polic

External link: http://arxiv.org/abs/2407.09024

Report

C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory

Authors: Luo, Tianjiao; Pearce, Tim; Chen, Huayu; Chen, Jianfei; Zhu, Jun

Generative Adversarial Imitation Learning (GAIL) trains a generative policy to mimic a demonstrator. It uses on-policy Reinforcement Learning (RL) to optimize a reward signal derived from a GAN-like discriminator. A major drawback of GAIL is its trai

External link: http://arxiv.org/abs/2402.16349

Report

Noise Contrastive Alignment of Language Models with Explicit Rewards

Authors: Chen, Huayu; He, Guande; Yuan, Lifan; Cui, Ganqu; Su, Hang; Zhu, Jun

User intentions are typically formalized as evaluation rewards to be maximized when fine-tuning language models (LMs). Existing alignment methods, such as Direct Preference Optimization (DPO), are mainly tailored for pairwise preference data where re

External link: http://arxiv.org/abs/2402.05369

Report

Score Regularized Policy Optimization through Diffusion Behavior

Authors: Chen, Huayu; Lu, Cheng; Wang, Zhengyi; Su, Hang; Zhu, Jun

Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow because it

External link: http://arxiv.org/abs/2310.07297

Report

Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning

Authors: Lu, Cheng; Chen, Huayu; Chen, Jianfei; Su, Hang; Li, Chongxuan; Zhu, Jun

Guided sampling is a vital approach for applying diffusion models in real-world tasks that embeds human-defined guidance during the sampling procedure. This paper considers a general setting where the guidance is defined by an (unnormalized) energy f

External link: http://arxiv.org/abs/2304.12824

Report

Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling

Authors: Chen, Huayu; Lu, Cheng; Ying, Chengyang; Su, Hang; Zhu, Jun

In offline reinforcement learning, weighted regression is a common method to ensure the learned policy stays close to the behavior policy and to prevent selecting out-of-sample actions. In this work, we show that due to the limited distributional exp

External link: http://arxiv.org/abs/2209.14548

Report

Weight-based Channel-model Matrix Framework provides a reasonable solution for EEG-based cross-dataset emotion recognition

Authors: Chen, Huayu; He, Huanhuan; Zhu, Jing; Sun, Shuting; Li, Jianxiu; Shao, Xuexiao; Li, Junxiang; Li, Xiaowei; Hu, Bin

Cross-dataset emotion recognition as an extremely challenging task in the field of EEG-based affective computing is influenced by many factors, which makes the universal models yield unsatisfactory results. Facing the situation that lacks EEG informa

External link: http://arxiv.org/abs/2209.05849

Academic article

Few-layer MoAlB nanosheets with Al vacancies enhanced hydroxyl adsorption for improved water oxidation kinetics

Authors: Chen, Huayu; Wang, Zehao; He, He; Chen, Jiadian; Yin, Hang; Yu, Dandan; Liang, Junhui; Qin, Laishun; Huang, Yuexiang; Chen, Da

Published in: Renewable Energy, May 2024, 225

Report

Tianshou: a Highly Modularized Deep Reinforcement Learning Library

Authors: Weng, Jiayi; Chen, Huayu; Yan, Dong; You, Kaichao; Duburcq, Alexis; Zhang, Minghao; Su, Yi; Su, Hang; Zhu, Jun

In this paper, we present Tianshou, a highly modularized Python library for deep reinforcement learning (DRL) that uses PyTorch as its backend. Tianshou intends to be research-friendly by providing a flexible and reliable infrastructure of DRL algori

External link: http://arxiv.org/abs/2107.14171
