Showing 1 - 10 of 30 for the search '"Du, Zhengxiao"'
Author:
GLM, Team, Zeng, Aohan, Xu, Bin, Wang, Bowen, Zhang, Chenhui, Yin, Da, Rojas, Diego, Feng, Guanyu, Zhao, Hanlin, Lai, Hanyu, Yu, Hao, Wang, Hongning, Sun, Jiadai, Zhang, Jiajie, Cheng, Jiale, Gui, Jiayi, Tang, Jie, Zhang, Jing, Li, Juanzi, Zhao, Lei, Wu, Lindong, Zhong, Lucen, Liu, Mingdao, Huang, Minlie, Zhang, Peng, Zheng, Qinkai, Lu, Rui, Duan, Shuaiqi, Zhang, Shudan, Cao, Shulin, Yang, Shuxun, Tam, Weng Lam, Zhao, Wenyi, Liu, Xiao, Xia, Xiao, Zhang, Xiaohan, Gu, Xiaotao, Lv, Xin, Liu, Xinghan, Liu, Xinyi, Yang, Xinyue, Song, Xixuan, Zhang, Xunkai, An, Yifan, Xu, Yifan, Niu, Yilin, Yang, Yuantao, Li, Yueyan, Bai, Yushi, Dong, Yuxiao, Qi, Zehan, Wang, Zhaoyu, Yang, Zhen, Du, Zhengxiao, Hou, Zhenyu, Wang, Zihan
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM.
External link:
http://arxiv.org/abs/2406.12793
Author:
Xu, Yifan, Liu, Xiao, Liu, Xinghan, Hou, Zhenyu, Li, Yueyan, Zhang, Xiaohan, Wang, Zihan, Zeng, Aohan, Du, Zhengxiao, Zhao, Wenyi, Tang, Jie, Dong, Yuxiao
Large language models (LLMs) have shown excellent mastery of human language but still struggle in real-world applications that require mathematical problem-solving. While many strategies and datasets for enhancing LLMs' mathematics have been developed, it remains a challenge to simultaneously maintain and improve both language and mathematical capabilities in deployed LLM systems.
External link:
http://arxiv.org/abs/2404.02893
Author:
Hou, Zhenyu, Niu, Yilin, Du, Zhengxiao, Zhang, Xiaohan, Liu, Xiao, Zeng, Aohan, Zheng, Qinkai, Huang, Minlie, Wang, Hongning, Tang, Jie, Dong, Yuxiao
ChatGLM is a free-to-use AI service powered by the ChatGLM family of large language models (LLMs). In this paper, we present the ChatGLM-RLHF pipeline -- a reinforcement learning from human feedback (RLHF) system -- designed to enhance ChatGLM's alignment with human preferences. (A hedged sketch of the reward-modeling step this implies follows this entry.)
External link:
http://arxiv.org/abs/2404.00934
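The abstract above describes the ChatGLM-RLHF pipeline only at a high level; the paper itself covers the full system. For orientation only, here is a minimal, self-contained PyTorch sketch of the reward-modeling step that RLHF systems in general rely on. This is not the authors' code: the tiny backbone, the names (`RewardModel`, `preference_loss`), and all sizes are hypothetical, with a standard Bradley-Terry pairwise loss standing in for whatever objective the paper actually uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Toy scalar reward head over pooled hidden states.

    In a real RLHF pipeline the backbone would be the supervised
    fine-tuned language model; a tiny embedding + mean-pool stands
    in here so the sketch runs self-contained.
    """
    def __init__(self, vocab_size=1000, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, token_ids):
        h = self.embed(token_ids).mean(dim=1)  # (batch, hidden)
        return self.head(h).squeeze(-1)        # one scalar reward per sequence

def preference_loss(r_chosen, r_rejected):
    # Bradley-Terry pairwise loss: push the preferred response's
    # reward above the rejected response's reward.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Hypothetical batch: token ids for (chosen, rejected) response pairs.
chosen = torch.randint(0, 1000, (4, 16))
rejected = torch.randint(0, 1000, (4, 16))

loss = preference_loss(model(chosen), model(rejected))
loss.backward()
opt.step()
print(f"pairwise preference loss: {loss.item():.4f}")
```

The trained reward model then scores policy samples during the reinforcement-learning stage; the engineering around that loop is the paper's subject and is not captured by this sketch.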
Recent studies have put into question the belief that emergent abilities in language models are exclusive to large models. This skepticism arises from two observations: 1) smaller models can also exhibit high performance on emergent abilities and 2) there is doubt about the discontinuous metrics used to measure these abilities.
External link:
http://arxiv.org/abs/2403.15796
Author:
Zhang, Dan, Hu, Ziniu, Zhoubian, Sining, Du, Zhengxiao, Yang, Kaiyu, Wang, Zihan, Yue, Yisong, Dong, Yuxiao, Tang, Jie
Large Language Models (LLMs) have shown promise in assisting scientific discovery. However, such applications are currently limited by LLMs' deficiencies in understanding intricate scientific concepts, deriving symbolic equations, and solving advanced numerical calculations.
External link:
http://arxiv.org/abs/2401.07950
Author:
Bai, Yushi, Lv, Xin, Zhang, Jiajie, Lyu, Hongchang, Tang, Jiankai, Huang, Zhidian, Du, Zhengxiao, Liu, Xiao, Zeng, Aohan, Hou, Lei, Dong, Yuxiao, Tang, Jie, Li, Juanzi
Although large language models (LLMs) demonstrate impressive performance for many language tasks, most of them can only handle texts a few thousand tokens long, limiting their applications on longer sequence inputs, such as books, reports, and codebases.
External link:
http://arxiv.org/abs/2308.14508
Author:
Liu, Xiao, Yu, Hao, Zhang, Hanchen, Xu, Yifan, Lei, Xuanyu, Lai, Hanyu, Gu, Yu, Ding, Hangliang, Men, Kaiwen, Yang, Kejuan, Zhang, Shudan, Deng, Xiang, Zeng, Aohan, Du, Zhengxiao, Zhang, Chenhui, Shen, Sheng, Zhang, Tianjun, Su, Yu, Sun, Huan, Huang, Minlie, Dong, Yuxiao, Tang, Jie
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments.
External link:
http://arxiv.org/abs/2308.03688
Author:
Liu, Xiao, Lai, Hanyu, Yu, Hao, Xu, Yifan, Zeng, Aohan, Du, Zhengxiao, Zhang, Peng, Dong, Yuxiao, Tang, Jie
We present WebGLM, a web-enhanced question-answering system based on the General Language Model (GLM). Its goal is to augment a pre-trained large language model (LLM) with web search and retrieval capabilities while being efficient for real-world deployments. (A hedged retrieve-then-generate sketch follows this entry.)
External link:
http://arxiv.org/abs/2306.07906
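WebGLM's actual pipeline is described in the paper; purely for orientation, below is a self-contained retrieve-then-generate sketch of the general pattern the abstract names: augmenting an LLM with retrieval. The corpus `DOCUMENTS`, the bag-of-words `retrieve`, and the stub `generate` are all hypothetical stand-ins; a real deployment would call a search engine and prompt GLM with the fetched evidence.

```python
import math
from collections import Counter

# Toy in-memory "web" corpus; a real system would query a search engine.
DOCUMENTS = [
    "GLM is a general language model pretrained with autoregressive blank infilling.",
    "Paris is the capital of France and its largest city.",
    "Retrieval-augmented generation conditions an LLM on fetched evidence.",
]

def retrieve(query, docs, k=2):
    """Rank documents by a crude bag-of-words cosine similarity."""
    q = Counter(query.lower().split())
    def score(doc):
        d = Counter(doc.lower().split())
        overlap = sum(q[w] * d[w] for w in q)
        norm = math.sqrt(sum(v * v for v in q.values())) * \
               math.sqrt(sum(v * v for v in d.values()))
        return overlap / norm if norm else 0.0
    return sorted(docs, key=score, reverse=True)[:k]

def generate(question, evidence):
    """Placeholder for the LLM call; a real system would prompt the
    model with the question plus the retrieved passages."""
    context = " ".join(evidence)
    return f"[answer grounded in: {context!r}] for question: {question!r}"

question = "What is retrieval-augmented generation?"
evidence = retrieve(question, DOCUMENTS)
print(generate(question, evidence))
```

The design point the abstract hints at is efficiency: keeping the retriever cheap and the generator a single pre-trained LLM is what makes such a system practical to deploy.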
Author:
Zeng, Aohan, Liu, Xiao, Du, Zhengxiao, Wang, Zihan, Lai, Hanyu, Ding, Ming, Yang, Zhuoyi, Xu, Yifan, Zheng, Wendi, Xia, Xiao, Tam, Weng Lam, Ma, Zixuan, Xue, Yufei, Zhai, Jidong, Chen, Wenguang, Zhang, Peng, Dong, Yuxiao, Tang, Jie
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as good as GPT-3 (davinci) and unveil how models of such a scale can be successfully pre-trained.
External link:
http://arxiv.org/abs/2210.02414
Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training. However, in the context of NLU, prior work reveals that prompt tuning does not perform well for normal-sized pretrained models. (A hedged sketch of the mechanism follows this entry.)
External link:
http://arxiv.org/abs/2110.07602
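To make the mechanism in the last abstract concrete: prompt tuning prepends trainable continuous vectors to the input of a frozen model, so only those vectors (plus, typically, a small task head) need to be stored per task. The self-contained PyTorch sketch below shows this shallow variant under assumed toy sizes; the paper's P-Tuning v2 goes further by injecting prompts at every layer. All names and dimensions here are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden, n_prompt = 1000, 64, 8

# Stand-in "pretrained" backbone: embedding table + tiny transformer.
embed = nn.Embedding(vocab_size, hidden)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True),
    num_layers=2,
)
classifier = nn.Linear(hidden, 2)  # small per-task head

# Freeze the backbone: only the prompt and the head receive gradients,
# which is what keeps per-task storage and training memory small.
for p in list(embed.parameters()) + list(encoder.parameters()):
    p.requires_grad = False

# The continuous prompt: n_prompt trainable vectors, no real tokens.
prompt = nn.Parameter(torch.randn(n_prompt, hidden) * 0.02)

def forward(token_ids):
    tok = embed(token_ids)                                   # (B, T, H)
    pre = prompt.unsqueeze(0).expand(token_ids.size(0), -1, -1)
    h = encoder(torch.cat([pre, tok], dim=1))                # prepend prompt
    return classifier(h.mean(dim=1))                         # pooled logits

opt = torch.optim.Adam([prompt] + list(classifier.parameters()), lr=1e-3)
logits = forward(torch.randint(0, vocab_size, (4, 16)))
loss = F.cross_entropy(logits, torch.randint(0, 2, (4,)))
loss.backward()
opt.step()

trainable = prompt.numel() + sum(p.numel() for p in classifier.parameters())
print(f"trainable parameters per task: {trainable}")
```

Gradients still flow through the frozen encoder to the prompt vectors (freezing parameters does not block backpropagation to inputs), which is exactly why the method works with an unmodified backbone.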