Showing 1 - 10 of 181 for search: '"Chen, Huayu"'
Drawing upon recent advances in language model alignment, we formulate offline Reinforcement Learning as a two-stage optimization problem: First pretraining expressive generative policies on reward-free behavior datasets, then fine-tuning these polic
External link:
http://arxiv.org/abs/2407.09024
Generative Adversarial Imitation Learning (GAIL) trains a generative policy to mimic a demonstrator. It uses on-policy Reinforcement Learning (RL) to optimize a reward signal derived from a GAN-like discriminator. A major drawback of GAIL is its trai
External link:
http://arxiv.org/abs/2402.16349
User intentions are typically formalized as evaluation rewards to be maximized when fine-tuning language models (LMs). Existing alignment methods, such as Direct Preference Optimization (DPO), are mainly tailored for pairwise preference data where re
External link:
http://arxiv.org/abs/2402.05369
Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow because it
External link:
http://arxiv.org/abs/2310.07297
Guided sampling is a vital approach for applying diffusion models in real-world tasks that embeds human-defined guidance during the sampling procedure. This paper considers a general setting where the guidance is defined by an (unnormalized) energy f
External link:
http://arxiv.org/abs/2304.12824
In offline reinforcement learning, weighted regression is a common method to ensure the learned policy stays close to the behavior policy and to prevent selecting out-of-sample actions. In this work, we show that due to the limited distributional exp
External link:
http://arxiv.org/abs/2209.14548
Authors:
Chen, Huayu; He, Huanhuan; Zhu, Jing; Sun, Shuting; Li, Jianxiu; Shao, Xuexiao; Li, Junxiang; Li, Xiaowei; Hu, Bin
Cross-dataset emotion recognition, an extremely challenging task in the field of EEG-based affective computing, is influenced by many factors, which causes universal models to yield unsatisfactory results. Facing the situation that lacks EEG informa
External link:
http://arxiv.org/abs/2209.05849
Authors:
Chen, Huayu; Wang, Zehao; He, He; Chen, Jiadian; Yin, Hang; Yu, Dandan; Liang, Junhui; Qin, Laishun; Huang, Yuexiang; Chen, Da
Published in:
Renewable Energy, May 2024, 225
Authors:
Weng, Jiayi; Chen, Huayu; Yan, Dong; You, Kaichao; Duburcq, Alexis; Zhang, Minghao; Su, Yi; Su, Hang; Zhu, Jun
In this paper, we present Tianshou, a highly modularized Python library for deep reinforcement learning (DRL) that uses PyTorch as its backend. Tianshou intends to be research-friendly by providing a flexible and reliable infrastructure of DRL algori
External link:
http://arxiv.org/abs/2107.14171
Balancing salt concentration and fluorinated cosolvent for graphite cathode-based dual-ion batteries
Authors:
Luo, Wen; Yu, Dandan; Ge, Tianqi; Yang, Jie; Dong, Shuai; Chen, Huayu; Qin, Laishun; Huang, Yuexiang; Chen, Da
Published in:
Applied Energy, 15 March 2024, 358