Showing 1 - 10 of 175 for search: '"BAI Yushi"'
Author:
Zhang, Jiajie, Bai, Yushi, Lv, Xin, Gu, Wanjun, Liu, Danqing, Zou, Minhao, Cao, Shulin, Hou, Lei, Dong, Yuxiao, Feng, Ling, Li, Juanzi
Though current long-context large language models (LLMs) have demonstrated impressive capacities in answering user questions based on extensive text, the lack of citations in their responses makes user verification difficult, leading to concerns about…
External link:
http://arxiv.org/abs/2409.02897
Author:
Bai, Yushi, Zhang, Jiajie, Lv, Xin, Zheng, Linzhi, Zhu, Siqi, Hou, Lei, Dong, Yuxiao, Tang, Jie, Li, Juanzi
Current long context large language models (LLMs) can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding even a modest length of 2,000 words. Through controlled experiments, we find that the model's effective generation length…
External link:
http://arxiv.org/abs/2408.07055
Large language models (LLMs) excel in various capabilities but also pose safety risks such as generating harmful content and misinformation, even after safety alignment. In this paper, we explore the inner mechanisms of safety alignment from the perspective…
External link:
http://arxiv.org/abs/2406.14144
Author:
GLM, Team, Zeng, Aohan, Xu, Bin, Wang, Bowen, Zhang, Chenhui, Yin, Da, Zhang, Dan, Rojas, Diego, Feng, Guanyu, Zhao, Hanlin, Lai, Hanyu, Yu, Hao, Wang, Hongning, Sun, Jiadai, Zhang, Jiajie, Cheng, Jiale, Gui, Jiayi, Tang, Jie, Zhang, Jing, Sun, Jingyu, Li, Juanzi, Zhao, Lei, Wu, Lindong, Zhong, Lucen, Liu, Mingdao, Huang, Minlie, Zhang, Peng, Zheng, Qinkai, Lu, Rui, Duan, Shuaiqi, Zhang, Shudan, Cao, Shulin, Yang, Shuxun, Tam, Weng Lam, Zhao, Wenyi, Liu, Xiao, Xia, Xiao, Zhang, Xiaohan, Gu, Xiaotao, Lv, Xin, Liu, Xinghan, Liu, Xinyi, Yang, Xinyue, Song, Xixuan, Zhang, Xunkai, An, Yifan, Xu, Yifan, Niu, Yilin, Yang, Yuantao, Li, Yueyan, Bai, Yushi, Dong, Yuxiao, Qi, Zehan, Wang, Zhaoyu, Yang, Zhen, Du, Zhengxiao, Hou, Zhenyu, Wang, Zihan
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models…
External link:
http://arxiv.org/abs/2406.12793
The advancement of large language models (LLMs) relies on evaluation using public benchmarks, but data contamination can lead to overestimated performance. Previous research focuses on detecting contamination by determining whether the model has seen…
External link:
http://arxiv.org/abs/2406.04197
To advance the evaluation of multimodal math reasoning in large multimodal models (LMMs), this paper introduces a novel benchmark, MM-MATH. MM-MATH consists of 5,929 open-ended middle school math problems with visual contexts, with fine-grained classification…
External link:
http://arxiv.org/abs/2404.05091
Author:
Ying, Jiahao, Cao, Yixin, Bai, Yushi, Sun, Qianru, Wang, Bo, Tang, Wei, Ding, Zhaojun, Yang, Yizhe, Huang, Xuanjing, Yan, Shuicheng
Large language models (LLMs) have achieved impressive performance across various natural language benchmarks, prompting a continual need to curate more difficult datasets for larger LLMs, which is costly and time-consuming. In this paper, we propose…
External link:
http://arxiv.org/abs/2402.11894
Author:
Qi, Ji, Ding, Ming, Wang, Weihan, Bai, Yushi, Lv, Qingsong, Hong, Wenyi, Xu, Bin, Hou, Lei, Li, Juanzi, Dong, Yuxiao, Tang, Jie
Vision-Language Models (VLMs) have demonstrated their broad effectiveness thanks to extensive training in aligning visual instructions to responses. However, such training of conclusive alignment leads models to ignore essential visual reasoning, further…
External link:
http://arxiv.org/abs/2402.04236
Author:
Bai, Yushi, Lv, Xin, Zhang, Jiajie, He, Yuze, Qi, Ji, Hou, Lei, Tang, Jie, Dong, Yuxiao, Li, Juanzi
Extending large language models to effectively handle long contexts requires instruction fine-tuning on input sequences of similar length. To address this, we present LongAlign -- a recipe of the instruction data, training, and evaluation for long context…
External link:
http://arxiv.org/abs/2401.18058
Author:
He, Yuze, Bai, Yushi, Lin, Matthieu, Sheng, Jenny, Hu, Yubin, Wang, Qi, Wen, Yu-Hui, Liu, Yong-Jin
By lifting the pre-trained 2D diffusion models into Neural Radiance Fields (NeRFs), text-to-3D generation methods have made great progress. Many state-of-the-art approaches usually apply score distillation sampling (SDS) to optimize the NeRF representation…
External link:
http://arxiv.org/abs/2312.11774