Showing 1 - 10 of 1,120
for search: '"Wang, Zhendong"'
Aligning large language models with human preferences has emerged as a critical focus in language modeling research. Yet, integrating preference learning into Text-to-Image (T2I) generative models is still relatively uncharted territory. The Diffusion…
External link:
http://arxiv.org/abs/2406.06382
Diffusion-based text-to-image generation models trained on extensive text-image pairs have shown the capacity to generate photorealistic images consistent with textual descriptions. However, a significant limitation of these models is their slow sampling…
External link:
http://arxiv.org/abs/2406.01561
Traditional language model alignment methods, such as Direct Preference Optimization (DPO), are limited by their dependence on static, pre-collected paired preference data, which hampers their adaptability and practical applicability. To overcome this…
External link:
http://arxiv.org/abs/2405.20830
Offline reinforcement learning (RL) leverages pre-collected datasets to train optimal policies. Diffusion Q-Learning (DQL), introducing diffusion models as a powerful and expressive policy class, significantly boosts the performance of offline RL. However…
External link:
http://arxiv.org/abs/2405.19690
We introduce Score identity Distillation (SiD), an innovative data-free method that distills the generative capabilities of pretrained diffusion models into a single-step generator. SiD not only facilitates an exponentially fast reduction in Fréchet…
External link:
http://arxiv.org/abs/2404.04057
Author:
Chen, Xuxi, Wang, Zhendong, Sow, Daouda, Yang, Junjie, Chen, Tianlong, Liang, Yingbin, Zhou, Mingyuan, Wang, Zhangyang
In the rapidly advancing arena of large language models (LLMs), a key challenge is to enhance their capabilities amid a looming shortage of high-quality training data. Our study starts from an empirical strategy for the light continual training of LLMs…
External link:
http://arxiv.org/abs/2402.14270
In the field of large language models (LLMs), aligning models with the diverse preferences of users is a critical challenge. Direct Preference Optimization (DPO) has played a key role in this area. It works by using pairs of preferences derived from…
External link:
http://arxiv.org/abs/2402.10958
Author:
Bai, Chen, Shao, Zeman, Zhang, Guoxiang, Liang, Di, Yang, Jie, Zhang, Zhuorui, Guo, Yujian, Zhong, Chengzhang, Qiu, Yiqiao, Wang, Zhendong, Guan, Yichen, Zheng, Xiaoyin, Wang, Tao, Lu, Cheng
Realistic video simulation has shown significant potential across diverse applications, from virtual reality to film production. This is particularly true for scenarios where capturing videos in real-world settings is either impractical or expensive.
External link:
http://arxiv.org/abs/2401.17509
Author:
Chen, Tianqi, Liu, Yongfei, Wang, Zhendong, Yuan, Jianbo, You, Quanzeng, Yang, Hongxia, Zhou, Mingyuan
In light of the remarkable success of in-context learning in large language models, its potential extension to the vision domain, particularly with visual foundation models like Stable Diffusion, has sparked considerable interest. Existing approaches…
External link:
http://arxiv.org/abs/2312.01408
Among recent developments in time series forecasting methods, deep forecasting models have gained popularity as they can utilize hidden feature patterns in time series to improve forecasting performance. Nevertheless, the majority of current deep forecasting…
External link:
http://arxiv.org/abs/2310.08137