Showing 1 - 7 of 7 for search: '"Que, Haoran"'
Author:
Wang, Zekun; Zhu, King; Xu, Chunpu; Zhou, Wangchunshu; Liu, Jiaheng; Zhang, Yibo; Wang, Jiashuo; Shi, Ning; Li, Siyu; Li, Yizhi; Que, Haoran; Zhang, Zhaoxiang; Zhang, Yuanxing; Zhang, Ge; Xu, Ke; Fu, Jie; Huang, Wenhao
In this paper, we introduce MIO, a novel foundation model built on multimodal tokens, capable of understanding and generating speech, text, images, and videos in an end-to-end, autoregressive manner. While the emergence of large language models (LLMs) …
External link:
http://arxiv.org/abs/2409.17692
Author:
Que, Haoran; Duan, Feiyu; He, Liqun; Mou, Yutao; Zhou, Wangchunshu; Liu, Jiaheng; Rong, Wenge; Wang, Zekun Moore; Yang, Jian; Zhang, Ge; Peng, Junran; Zhang, Zhaoxiang; Zhang, Songyang; Chen, Kai
In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks (e.g., long-context understanding), and many benchmarks have been proposed. However, we observe that long text generation capabilities are not well …
External link:
http://arxiv.org/abs/2409.16191
Author:
Liu, Jiaheng; Zhang, Chenchen; Guo, Jinyang; Zhang, Yuanxing; Que, Haoran; Deng, Ken; Bai, Zhiqi; Liu, Jie; Zhang, Ge; Wang, Jiakai; Wu, Yanan; Liu, Congnan; Su, Wenbo; Wang, Jiamang; Qu, Lin; Zheng, Bo
Despite the advanced intelligence abilities of large language models (LLMs) in various applications, they still face significant computational and storage demands. Knowledge Distillation (KD) has emerged as an effective strategy to improve the performance …
External link:
http://arxiv.org/abs/2407.16154
Author:
Que, Haoran; Liu, Jiaheng; Zhang, Ge; Zhang, Chenchen; Qu, Xingwei; Ma, Yinghao; Duan, Feiyu; Bai, Zhiqi; Wang, Jiakai; Zhang, Yuanxing; Tan, Xu; Fu, Jie; Su, Wenbo; Wang, Jiamang; Qu, Lin; Zheng, Bo
Continual Pre-Training (CPT) on Large Language Models (LLMs) has been widely used to expand the model's fundamental understanding of specific downstream domains (e.g., math and code). For the CPT on domain-specific LLMs, one important question is how …
External link:
http://arxiv.org/abs/2406.01375
Author:
Liu, Jiaheng; Bai, Zhiqi; Zhang, Yuanxing; Zhang, Chenchen; Zhang, Yu; Zhang, Ge; Wang, Jiakai; Que, Haoran; Chen, Yukang; Su, Wenbo; Ge, Tiezheng; Fu, Jie; Chen, Wenhu; Zheng, Bo
Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources. Existing long-context extension methods usually need additional training procedures to support corresponding long-context …
External link:
http://arxiv.org/abs/2401.06951
Author:
Wang, Zekun Moore; Peng, Zhongyuan; Que, Haoran; Liu, Jiaheng; Zhou, Wangchunshu; Wu, Yuhan; Guo, Hongcheng; Gan, Ruitong; Ni, Zehao; Yang, Jian; Zhang, Man; Zhang, Zhaoxiang; Ouyang, Wanli; Xu, Ke; Huang, Stephen W.; Fu, Jie; Peng, Junran
The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and the …
External link:
http://arxiv.org/abs/2310.00746
Published in:
In Journal of Energy Storage, 1 September 2024, 97, Part A