Showing 1 - 10 of 888 for search: '"Xie, ZhiHui"'
Author:
Li, Lei, Wei, Yuancheng, Xie, Zhihui, Yang, Xuqing, Song, Yifan, Wang, Peiyi, An, Chenxin, Liu, Tianyu, Li, Sujian, Lin, Bill Yuchen, Kong, Lingpeng, Liu, Qi
Vision-language generative reward models (VL-GenRMs) play a crucial role in aligning and evaluating multimodal AI systems, yet their own evaluation remains under-explored. Current assessment methods primarily rely on AI-annotated preference labels fr...
External link:
http://arxiv.org/abs/2411.17451
Masked prediction has emerged as a promising pretraining paradigm in offline reinforcement learning (RL) due to its versatile masking schemes, enabling flexible inference across various downstream tasks with a unified model. Despite the versatility o... (see the sketch below)
External link:
http://arxiv.org/abs/2410.17744
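To make the masked-prediction idea concrete, here is a minimal sketch, not the paper's architecture, of reconstructing randomly masked state/action tokens in an interleaved trajectory with a small Transformer. All class and function names are illustrative assumptions.

```python
# Minimal sketch of masked trajectory prediction for offline RL pretraining.
# Trajectories are flattened into interleaved (state, action) tokens, a random
# subset of positions is masked, and the model reconstructs the masked tokens.
# Names and sizes are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class MaskedTrajectoryModel(nn.Module):
    def __init__(self, obs_dim, act_dim, d_model=128, n_layers=2, n_heads=4):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, d_model)
        self.act_proj = nn.Linear(act_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.obs_head = nn.Linear(d_model, obs_dim)
        self.act_head = nn.Linear(d_model, act_dim)

    def forward(self, obs, act, mask):
        # obs: (B, T, obs_dim), act: (B, T, act_dim), mask: (B, 2T) booleans
        tokens = torch.stack([self.obs_proj(obs), self.act_proj(act)], dim=2)
        tokens = tokens.flatten(1, 2)  # interleave as s1, a1, s2, a2, ...
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        h = self.encoder(tokens)
        return self.obs_head(h[:, 0::2]), self.act_head(h[:, 1::2])

def masked_prediction_loss(model, obs, act, mask_ratio=0.5):
    # Different masking schemes (random, action-only, future-only, ...)
    # correspond to different downstream inference tasks.
    B, T = obs.shape[:2]
    mask = torch.rand(B, 2 * T, device=obs.device) < mask_ratio
    pred_obs, pred_act = model(obs, act, mask)
    loss_obs = ((pred_obs - obs) ** 2)[mask[:, 0::2]].mean()
    loss_act = ((pred_act - act) ** 2)[mask[:, 1::2]].mean()
    return loss_obs + loss_act
```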
Author:
Li, Lei, Xie, Zhihui, Li, Mukai, Chen, Shunian, Wang, Peiyi, Chen, Liang, Yang, Yazheng, Wang, Benyou, Kong, Lingpeng, Liu, Qi
As large vision-language models (LVLMs) evolve rapidly, the demand for high-quality and diverse data to align these models becomes increasingly crucial. However, the creation of such data with human supervision proves costly and time-intensive. In th...
External link:
http://arxiv.org/abs/2410.09421
The widespread adoption of large language models (LLMs) has raised concerns about their safety and reliability, particularly regarding their vulnerability to adversarial attacks. In this paper, we propose a novel perspective that attributes this vuln...
External link:
http://arxiv.org/abs/2406.14393
Large language models (LLMs) have demonstrated impressive capabilities in various reasoning tasks, aided by techniques like chain-of-thought prompting that elicits verbalized reasoning. However, LLMs often generate text with obvious mistakes and cont... (see the sketch below)
External link:
http://arxiv.org/abs/2405.18711
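For context on the chain-of-thought prompting mentioned above, the sketch below shows the generic pattern of eliciting verbalized reasoning and then dampening flawed chains with self-consistency-style majority voting, a related but distinct technique. The `generate` callable is a placeholder for any LLM completion API; nothing here reflects this paper's specific method.

```python
# Generic illustration of chain-of-thought prompting with self-consistency
# style majority voting over sampled reasoning chains. `generate` is a
# placeholder for an LLM completion call, not an API from the paper.
from collections import Counter

COT_TEMPLATE = "Q: {question}\nA: Let's think step by step."

def extract_final_answer(text):
    # Naive heuristic: treat the last non-empty line as the final answer.
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    return lines[-1] if lines else ""

def chain_of_thought_answer(question, generate, n_samples=5):
    """Sample several verbalized reasoning chains and majority-vote the answer."""
    prompt = COT_TEMPLATE.format(question=question)
    answers = []
    for _ in range(n_samples):
        reasoning = generate(prompt, temperature=0.7)  # placeholder LLM call
        answers.append(extract_final_answer(reasoning))
    # Voting across chains reduces the impact of individual chains that
    # contain obvious mistakes or contradictions.
    return Counter(answers).most_common(1)[0][0]
```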
Author:
Reka Team, Ormazabal, Aitor, Zheng, Che, d'Autume, Cyprien de Masson, Yogatama, Dani, Fu, Deyu, Ong, Donovan, Chen, Eric, Lamprecht, Eugenie, Pham, Hai, Ong, Isaac, Aleksiev, Kaloyan, Li, Lei, Henderson, Matthew, Bain, Max, Artetxe, Mikel, Relan, Nishant, Padlewski, Piotr, Liu, Qi, Chen, Ren, Phua, Samuel, Yang, Yazheng, Tay, Yi, Wang, Yuqi, Zhu, Zhongkai, Xie, Zhihui
We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka. Reka models are able to process and reason with text, images, video, and audio inputs. This technical report discusses details of t...
External link:
http://arxiv.org/abs/2404.12387
Large pretrained multilingual language models (ML-LMs) have shown remarkable capabilities of zero-shot cross-lingual transfer, without direct cross-lingual supervision. While these results are promising, follow-up works found that, within the multili...
External link:
http://arxiv.org/abs/2401.05792
Author:
Li, Lei, Xie, Zhihui, Li, Mukai, Chen, Shunian, Wang, Peiyi, Chen, Liang, Yang, Yazheng, Wang, Benyou, Kong, Lingpeng
This paper explores preference distillation for large vision language models (LVLMs), improving their ability to generate helpful and faithful responses anchoring the visual context. We first build a vision-language feedback (VLFeedback) dataset util... (see the sketch below)
External link:
http://arxiv.org/abs/2312.10665
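Preference distillation over pairs of preferred and rejected responses is commonly implemented with a DPO-style objective; the sketch below assumes that choice and uses per-response summed log-probabilities. It illustrates the general recipe, not this paper's exact training setup, and all names are illustrative.

```python
# Minimal sketch of a DPO-style preference-distillation loss, one common way
# to train on preferred/rejected response pairs such as those found in a
# vision-language feedback dataset. Assumes log-probabilities are summed
# over response tokens; variable names are illustrative.
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """
    policy_logp_*: log p_theta(response | image, prompt) under the trained model
    ref_logp_*:    the same quantities under a frozen reference model
    """
    chosen_rewards = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_rewards = beta * (policy_logp_rejected - ref_logp_rejected)
    # Maximize the margin between preferred and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```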
Recent research in offline reinforcement learning (RL) has demonstrated that return-conditioned supervised learning is a powerful paradigm for decision-making problems. While promising, return conditioning is limited to training data labeled with rew... (see the sketch below)
External link:
http://arxiv.org/abs/2305.16683
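As a reference point for return-conditioned supervised learning, here is a minimal sketch: a policy imitates dataset actions conditioned on the return-to-go, which is exactly where the dependence on reward labels comes from. Names and network sizes are illustrative assumptions, not the paper's model.

```python
# Minimal sketch of return-conditioned supervised learning: the policy is
# trained to imitate actions conditioned on the observed return-to-go, so a
# high target return can be fed in at test time to elicit good behavior.
import torch
import torch.nn as nn

class ReturnConditionedPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, return_to_go):
        # Condition the action prediction on the scalar return-to-go.
        x = torch.cat([obs, return_to_go.unsqueeze(-1)], dim=-1)
        return self.net(x)

def rcsl_loss(policy, obs, act, rewards):
    # Return-to-go at each step: sum of rewards from that step onward.
    # Note that this requires reward labels, which is the limitation the
    # abstract points at.
    rtg = torch.flip(torch.cumsum(torch.flip(rewards, dims=[-1]), dim=-1), dims=[-1])
    pred = policy(obs, rtg)
    return ((pred - act) ** 2).mean()
```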
The past few years have seen rapid progress in combining reinforcement learning (RL) with deep learning. Various breakthroughs ranging from games to robotics have spurred the interest in designing sophisticated RL algorithms and systems. However, the...
External link:
http://arxiv.org/abs/2211.03959