Showing 1 - 10 of 629
for search: '"WU, CHENGWEI"'
Author:
Wang, Liangdong, Zhang, Bo-Wen, Wu, Chengwei, Zhao, Hanyu, Shi, Xiaofeng, Gu, Shuhao, Li, Jijie, Ma, Quanyue, Pan, TengFei, Liu, Guang
We present CCI3.0-HQ (https://huggingface.co/datasets/BAAI/CCI3-HQ), a high-quality 500GB subset of the Chinese Corpora Internet 3.0 (CCI3.0) (https://huggingface.co/datasets/BAAI/CCI3-Data), developed using a novel two-stage hybrid filtering pipeline
External link:
http://arxiv.org/abs/2410.18505
With the availability of various instruction datasets, a pivotal challenge is how to effectively select and integrate these instructions to fine-tune large language models (LLMs). Previous research mainly focuses on selecting individual high-quality
External link:
http://arxiv.org/abs/2409.07045
Author:
Zhang, Bo-Wen, Wang, Liangdong, Yuan, Ye, Li, Jijie, Gu, Shuhao, Zhao, Mengdi, Wu, Xinya, Liu, Guang, Wu, Chengwei, Zhao, Hanyu, Du, Li, Ju, Yiming, Ma, Quanyue, Ao, Yulong, Zhao, Yingli, Zhu, Songhe, Cao, Zhou, Liang, Dong, Lin, Yonghua, Zhang, Ming, Wang, Shunfei, Zhou, Yanxin, Ye, Min, Chen, Xuekai, Yu, Xinyang, Huang, Xiangjun, Yang, Jian
In recent years, with the rapid application of large language models across various fields, the scale of these models has gradually increased, and the resources required for their pre-training have grown exponentially. Training an LLM from scratch wi
External link:
http://arxiv.org/abs/2408.06567
Author:
Sun, Hongda, Liu, Yuxuan, Wu, Chengwei, Yan, Haiyu, Tai, Cheng, Gao, Xin, Shang, Shuo, Yan, Rui
Open-domain question answering (ODQA) has emerged as a pivotal research spotlight in information systems. Existing methods follow two main paradigms to collect evidence: (1) The retrieve-then-read paradigm retrieves pertinent documents from
External link:
http://arxiv.org/abs/2403.05217
Published in:
IET Cyber-Systems and Robotics, 2023, 5(1), e12066
Recurrent Neural Network, Long Short-Term Memory, and Transformer have made great progress in predicting the trajectories of moving objects. Although the trajectory element with the surrounding scene features has been merged to improve performance, t
External link:
http://arxiv.org/abs/2304.07711
Published in:
In Construction and Building Materials 15 November 2024 451
Published in:
In Carbohydrate Polymers 15 October 2024 342
Published in:
In International Journal of Biological Macromolecules January 2025 287
Published in:
In European Journal of Mechanics / A Solids January-February 2025 109