Výsledky vyhledávání - "Chang, Yongzhe"

Report

Morphology and Behavior Co-Optimization of Modular Satellites for Attitude Control

Autor: Wang, Yuxing, Li, Jie, Yu, Cong, Li, Xinyang, Huang, Simeng, Chang, Yongzhe, Wang, Xueqian, Liang, Bin

The emergence of modular satellites marks a significant transformation in spacecraft engineering, introducing a new paradigm of flexibility, resilience, and scalability in space exploration endeavors. In addressing complex challenges such as attitude

Externí odkaz: http://arxiv.org/abs/2409.13166

Zobrazit plný text záznamu

Report

Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through $f$-divergence Minimization

Autor: Sun, Haoyuan, Xia, Bo, Chang, Yongzhe, Wang, Xueqian

Direct Preference Optimization (DPO) has recently expanded its successful application from aligning large language models (LLMs) to aligning text-to-image models with human preferences, which has generated considerable interest within the community.

Externí odkaz: http://arxiv.org/abs/2409.09774

Zobrazit plný text záznamu

Report

Probing the Safety Response Boundary of Large Language Models via Unsafe Decoding Path Generation

Autor: Wang, Haoyu, Wu, Bingzhe, Bian, Yatao, Chang, Yongzhe, Wang, Xueqian, Zhao, Peilin

Large Language Models (LLMs) are implicit troublemakers. While they provide valuable insights and assist in problem-solving, they can also potentially serve as a resource for malicious activities. Implementing safety alignment could mitigate the risk

Externí odkaz: http://arxiv.org/abs/2408.10668

Zobrazit plný text záznamu

Report

QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning

Autor: Kong, Yilun, Mao, Hangyu, Zhao, Qi, Zhang, Bin, Ruan, Jingqing, Shen, Li, Chang, Yongzhe, Wang, Xueqian, Zhao, Rui, Tao, Dacheng

Prompt engineering has demonstrated remarkable success in enhancing the performance of large language models (LLMs) across diverse tasks. However, most existing prompt optimization methods only focus on the task-level performance, overlooking the imp

Externí odkaz: http://arxiv.org/abs/2408.10504

Zobrazit plný text záznamu

Report

DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays

Autor: Xia, Bo, Kong, Yilun, Chang, Yongzhe, Yuan, Bo, Li, Zhiheng, Wang, Xueqian, Liang, Bin

Classic reinforcement learning (RL) frequently confronts challenges in tasks involving delays, which cause a mismatch between received observations and subsequent actions, thereby deviating from the Markov assumption. Existing methods usually tackle

Externí odkaz: http://arxiv.org/abs/2406.03102

Zobrazit plný text záznamu

Report

A Method on Searching Better Activation Functions

Autor: Sun, Haoyuan, Wu, Zihao, Xia, Bo, Chang, Pu, Dong, Zibin, Yuan, Yifu, Chang, Yongzhe, Wang, Xueqian

The success of artificial neural networks (ANNs) hinges greatly on the judicious selection of an activation function, introducing non-linearity into network and enabling them to model sophisticated relationships in data. However, the search of activa

Externí odkaz: http://arxiv.org/abs/2405.12954

Zobrazit plný text záznamu

Report

Are Large Language Models Really Robust to Word-Level Perturbations?

Autor: Wang, Haoyu, Ma, Guozheng, Yu, Cong, Gui, Ning, Zhang, Linrui, Huang, Zhiqi, Ma, Suwei, Chang, Yongzhe, Zhang, Sen, Shen, Li, Wang, Xueqian, Zhao, Peilin, Tao, Dacheng

The swift advancement in the scales and capabilities of Large Language Models (LLMs) positions them as promising tools for a variety of downstream tasks. In addition to the pursuit of better performance and the avoidance of violent feedback on a cert

Externí odkaz: http://arxiv.org/abs/2309.11166

Zobrazit plný text záznamu

Report

SaFormer: A Conditional Sequence Modeling Approach to Offline Safe Reinforcement Learning

Autor: Zhang, Qin, Zhang, Linrui, Xu, Haoran, Shen, Li, Wang, Bowen, Chang, Yongzhe, Wang, Xueqian, Yuan, Bo, Tao, Dacheng

Offline safe RL is of great practical relevance for deploying agents in real-world applications. However, acquiring constraint-satisfying policies from the fixed dataset is non-trivial for conventional approaches. Even worse, the learned constraints

Externí odkaz: http://arxiv.org/abs/2301.12203

Zobrazit plný text záznamu

Report

A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning

Autor: Wang, Yuxing, Zhang, Tiantian, Chang, Yongzhe, Liang, Bin, Wang, Xueqian, Yuan, Bo

The integration of Reinforcement Learning (RL) and Evolutionary Algorithms (EAs) aims at simultaneously exploiting the sample efficiency as well as the diversity and robustness of the two paradigms. Recently, hybrid learning frameworks based on this

Externí odkaz: http://arxiv.org/abs/2201.00129

Zobrazit plný text záznamu

Report

Probability Density Estimation Based Imitation Learning

Autor: Liu, Yang, Chang, Yongzhe, Jiang, Shilei, Wang, Xueqian, Liang, Bin, Yuan, Bo

Imitation Learning (IL) is an effective learning paradigm exploiting the interactions between agents and environments. It does not require explicit reward signals and instead tries to recover desired policies using expert demonstrations. In general,

Externí odkaz: http://arxiv.org/abs/2112.06746

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání