Showing 1 - 10 of 126 results for search: '"Tan, Chuanqi"'
In cross-lingual language understanding, machine translation is often utilized to enhance the transferability of models across languages, either by translating the training data from the source language to the target, or from the target to the source…
External link:
http://arxiv.org/abs/2311.06758
Authors:
Li, Chengpeng, Yuan, Zheng, Yuan, Hongyi, Dong, Guanting, Lu, Keming, Wu, Jiancan, Tan, Chuanqi, Wang, Xiang, Zhou, Chang
In math reasoning with large language models (LLMs), fine-tuning data augmentation by query evolution and diverse reasoning paths is empirically verified effective, profoundly narrowing the gap between open-sourced LLMs and cutting-edge proprietary L…
External link:
http://arxiv.org/abs/2310.05506
Authors:
Bai, Jinze, Bai, Shuai, Chu, Yunfei, Cui, Zeyu, Dang, Kai, Deng, Xiaodong, Fan, Yang, Ge, Wenbin, Han, Yu, Huang, Fei, Hui, Binyuan, Ji, Luo, Li, Mei, Lin, Junyang, Lin, Runji, Liu, Dayiheng, Liu, Gao, Lu, Chengqiang, Lu, Keming, Ma, Jianxin, Men, Rui, Ren, Xingzhang, Ren, Xuancheng, Tan, Chuanqi, Tan, Sinan, Tu, Jianhong, Wang, Peng, Wang, Shijie, Wang, Wei, Wu, Shengguang, Xu, Benfeng, Xu, Jin, Yang, An, Yang, Hao, Yang, Jian, Yang, Shusheng, Yao, Yang, Yu, Bowen, Yuan, Hongyi, Yuan, Zheng, Zhang, Jianwei, Zhang, Xingxuan, Zhang, Yichang, Zhang, Zhenru, Zhou, Chang, Zhou, Jingren, Zhou, Xiaohuan, Zhu, Tianhang
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our la…
External link:
http://arxiv.org/abs/2309.16609
Large language models (LLMs) enable in-context learning (ICL) by conditioning on a few labeled training examples as a text-based prompt, eliminating the need for parameter updates and achieving competitive performance. In this paper, we demonstrate t…
External link:
http://arxiv.org/abs/2309.14771
Authors:
Lu, Keming, Yuan, Hongyi, Yuan, Zheng, Lin, Runji, Lin, Junyang, Tan, Chuanqi, Zhou, Chang, Zhou, Jingren
Foundation language models obtain the instruction-following ability through supervised fine-tuning (SFT). Diversity and complexity are considered critical factors of a successful SFT dataset, while their definitions remain obscure and lack quantitati…
External link:
http://arxiv.org/abs/2308.07074
Authors:
Yuan, Zheng, Yuan, Hongyi, Li, Chengpeng, Dong, Guanting, Lu, Keming, Tan, Chuanqi, Zhou, Chang, Zhou, Jingren
Mathematical reasoning is a challenging task for large language models (LLMs), while the scaling relationship of it with respect to LLM capacity is under-explored. In this paper, we investigate how the pre-training loss, supervised data amount, and a…
External link:
http://arxiv.org/abs/2308.01825
Fine-tuning large pre-trained language models on various downstream tasks with whole parameters is prohibitively expensive. Hence, parameter-efficient fine-tuning, which optimizes only a few task-specific parameters with the fro…
External link:
http://arxiv.org/abs/2305.15212
Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs. However, despite the promisi…
External link:
http://arxiv.org/abs/2305.08732
Recent studies have demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages. In addition to involving the masked language model objective, existing cross-lingual pre-training works…
External link:
http://arxiv.org/abs/2304.08205
Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models with human preferences, significantly enhancing the quality of interactions between humans and models. InstructGPT implements RLHF through several st…
External link:
http://arxiv.org/abs/2304.05302