Showing 1 - 10 of 273 for search: '"LIN Haibin"'
Author:
Sheng, Guangming, Zhang, Chi, Ye, Zilingfeng, Wu, Xibin, Zhang, Wang, Zhang, Ru, Peng, Yanghua, Lin, Haibin, Wu, Chuan
Reinforcement Learning from Human Feedback (RLHF) is widely used in Large Language Model (LLM) alignment. Traditional RL can be modeled as a dataflow, where each node represents computation of a neural network (NN) and each edge denotes data dependencies …
External link:
http://arxiv.org/abs/2409.19256
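To make the dataflow view in the abstract above concrete, here is a minimal sketch, not the paper's system or API: the node names and the toy run_node function are hypothetical, and it only shows an RLHF-style pipeline expressed as a dependency graph and executed in topological order.

```python
# Illustrative sketch only: modeling an RLHF-style pipeline as a dataflow graph,
# where each node is a neural-network computation and each edge a data dependency.
# Node names and the toy "computations" below are hypothetical, not the paper's API.
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (its incoming edges).
dependencies = {
    "actor_generate": set(),                              # actor model produces responses
    "reward_score":   {"actor_generate"},                 # reward model scores responses
    "critic_value":   {"actor_generate"},                 # critic estimates values
    "actor_update":   {"reward_score", "critic_value"},   # PPO-style policy update
    "critic_update":  {"reward_score", "critic_value"},
}

def run_node(name, inputs):
    """Placeholder for launching the NN computation of one dataflow node."""
    print(f"running {name} with inputs from {sorted(inputs)}")
    return f"<output of {name}>"

# Execute nodes in an order that respects all data dependencies.
outputs = {}
for node in TopologicalSorter(dependencies).static_order():
    outputs[node] = run_node(node, dependencies[node])
```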
Multimodal large language models (MLLMs) have extended the success of large language models (LLMs) to multiple data types, such as image, text and audio, achieving significant performance in various domains, including multimodal translation, visual question answering …
External link:
http://arxiv.org/abs/2408.03505
Author:
Wan, Borui, Han, Mingji, Sheng, Yiyao, Peng, Yanghua, Lin, Haibin, Zhang, Mofan, Lai, Zhichao, Yu, Menghan, Zhang, Junda, Song, Zuquan, Liu, Xin, Wu, Chuan
Checkpointing to preserve training states is crucial during the development of Large Foundation Models (LFMs), for training resumption upon various failures or changes in GPU resources and parallelism configurations. In addition, saved checkpoints are …
External link:
http://arxiv.org/abs/2407.20143
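As a rough illustration of the checkpointing idea described above, the following is a generic save/resume sketch assuming plain torch.save/torch.load, not the checkpointing system the paper presents; the file name ckpt.pt and the toy model are hypothetical.

```python
# Illustrative sketch only: generic periodic checkpointing of training state with
# plain PyTorch, not the checkpointing system described in the paper.
import os
import torch

def save_checkpoint(path, step, model, optimizer):
    # Persist everything needed to resume: weights, optimizer state, progress.
    torch.save(
        {"step": step,
         "model": model.state_dict(),
         "optimizer": optimizer.state_dict()},
        path,
    )

def load_checkpoint(path, model, optimizer):
    # Restore training state if a checkpoint exists; otherwise start from step 0.
    if not os.path.exists(path):
        return 0
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
start_step = load_checkpoint("ckpt.pt", model, optimizer)
for step in range(start_step, start_step + 100):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 8)).pow(2).mean()
    loss.backward()
    optimizer.step()
    if step % 50 == 0:   # checkpoint periodically so failures lose little work
        save_checkpoint("ckpt.pt", step + 1, model, optimizer)
```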
A number of production deep learning clusters have attempted to explore inference hardware for DNN training, at the off-peak serving hours with many inference GPUs idling. Conducting DNN training with a combination of heterogeneous training and inference …
External link:
http://arxiv.org/abs/2407.02327
Author:
Chang, Li-Wen, Bao, Wenlei, Hou, Qi, Jiang, Chengquan, Zheng, Ningxin, Zhong, Yinmin, Zhang, Xuanrun, Song, Zuquan, Jiang, Ziheng, Lin, Haibin, Jin, Xin, Liu, Xin
Large deep learning models have demonstrated strong ability to solve many tasks across a wide range of applications. Those large models typically require training and inference to be distributed. Tensor parallelism is a common technique partitioning …
External link:
http://arxiv.org/abs/2406.06858
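To illustrate the basic partitioning idea behind tensor parallelism mentioned above, here is a single-host NumPy sketch in which array shards stand in for per-GPU partitions; it shows only column-wise partitioning of one linear layer, not the communication-overlapping technique of the paper, and all sizes are made up.

```python
# Illustrative sketch only: column-wise tensor parallelism of a single linear layer,
# simulated on one host with NumPy shards standing in for per-GPU partitions.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))       # activations, replicated on every "device"
W = rng.standard_normal((16, 32))      # full weight matrix of a linear layer

num_devices = 4
W_shards = np.split(W, num_devices, axis=1)   # each device holds a slice of the columns

# Each device computes its partial output independently...
partial_outputs = [x @ W_shard for W_shard in W_shards]

# ...and gathering along the feature dimension reconstructs the full result.
y_parallel = np.concatenate(partial_outputs, axis=1)

assert np.allclose(y_parallel, x @ W)  # matches the unpartitioned computation
```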
Recent breakthroughs in Large-scale language models (LLMs) have demonstrated impressive performance on various tasks. The immense sizes of LLMs have led to very high resource demand and cost for running the models. Though the models are largely served …
External link:
http://arxiv.org/abs/2403.01136
Author:
Jiang, Ziheng, Lin, Haibin, Zhong, Yinmin, Huang, Qi, Chen, Yangrui, Zhang, Zhi, Peng, Yanghua, Li, Xiang, Xie, Cong, Nong, Shibiao, Jia, Yulu, He, Sun, Chen, Hongmin, Bai, Zhihao, Hou, Qi, Yan, Shipeng, Zhou, Ding, Sheng, Yiyao, Jiang, Zhuo, Xu, Haohan, Wei, Haoran, Zhang, Zhang, Nie, Pengfei, Zou, Leqi, Zhao, Sida, Xiang, Liang, Liu, Zherui, Li, Zhe, Jia, Xiaoying, Ye, Jianxi, Jin, Xin, Liu, Xin
We present the design, implementation and engineering experience in building and deploying MegaScale, a production system for training large language models (LLMs) at the scale of more than 10,000 GPUs. Training LLMs at this scale brings unprecedented …
External link:
http://arxiv.org/abs/2402.15627
Published in:
EuroSys 2024
Deep Neural Networks (DNNs) have shown excellent performance in a wide range of machine learning applications. Knowing the latency of running a DNN model or tensor program on a specific device is useful in various tasks, such as DNN graph- or tensor-level …
External link:
http://arxiv.org/abs/2311.09690
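The paper above concerns latency prediction; as a hedged illustration of how ground-truth timings might be collected, here is a simple measurement sketch with warm-up and repeated trials for a small tensor program on the local device. The matmul workload and the measure_latency helper are hypothetical, not the paper's benchmark harness.

```python
# Illustrative sketch only: directly measuring (not predicting) the latency of a
# small tensor program, with warm-up runs and repeated timed trials.
import time
import numpy as np

def measure_latency(fn, warmup=5, repeats=20):
    for _ in range(warmup):           # warm-up amortizes one-time costs (allocation, caches)
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return float(np.median(timings))  # median is robust to scheduling noise

a = np.random.default_rng(0).standard_normal((512, 512))
b = np.random.default_rng(1).standard_normal((512, 512))
latency_s = measure_latency(lambda: a @ b)
print(f"median matmul latency: {latency_s * 1e3:.2f} ms")
```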
Author:
Wang, Yite, Su, Jiahao, Lu, Hanlin, Xie, Cong, Liu, Tianyi, Yuan, Jianbo, Lin, Haibin, Sun, Ruoyu, Yang, Hongxia
Scaling of deep neural networks, especially Transformers, is pivotal for their surging performance and has further led to the emergence of sophisticated reasoning capabilities in foundation models. Such scaling generally requires training large models …
External link:
http://arxiv.org/abs/2310.07999
Gradient compression (GC) is a promising approach to addressing the communication bottleneck in distributed deep learning (DDL). However, it is challenging to find the optimal compression strategy for applying GC to DDL because of the intricate interactions …
External link:
http://arxiv.org/abs/2205.14465
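As one concrete example of a gradient-compression strategy of the kind discussed above, here is a top-k sparsification sketch in NumPy; top-k is just one common method and is not claimed to be the strategy the paper selects, and the helper names and compression ratio are made up.

```python
# Illustrative sketch only: top-k sparsification, one common gradient-compression
# strategy; only the largest-magnitude gradient entries are communicated.
import numpy as np

def topk_compress(grad, ratio=0.01):
    """Keep only the largest-magnitude entries of a gradient tensor."""
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # indices of the k largest magnitudes
    return idx, flat[idx], grad.shape              # what would actually be sent

def topk_decompress(idx, values, shape):
    """Rebuild a dense (mostly zero) gradient from the communicated sparse entries."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = values
    return flat.reshape(shape)

grad = np.random.default_rng(0).standard_normal((1024, 1024))
idx, values, shape = topk_compress(grad, ratio=0.01)
restored = topk_decompress(idx, values, shape)
print(f"sent {values.size} of {grad.size} values "
      f"({values.size / grad.size:.1%} of the original traffic)")
```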