Showing 1 - 10 of 24 for the search: '"Fan, Shiqing"'
Author:
He, Ethan, Khattar, Abhinav, Prenger, Ryan, Korthikanti, Vijay, Yan, Zijie, Liu, Tong, Fan, Shiqing, Aithal, Ashwath, Shoeybi, Mohammad, Catanzaro, Bryan
Upcycling pre-trained dense language models into sparse mixture-of-experts (MoE) models is an efficient way to increase the capacity of already trained models. However, optimal techniques for upcycling at scale remain unclear. In this work … (a sketch of the general upcycling idea follows this entry)
External link:
http://arxiv.org/abs/2410.07524
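The entry above describes dense-to-MoE upcycling only at a high level, so the following is a minimal sketch of the general idea rather than the paper's recipe: every expert FFN starts as a copy of the pre-trained dense FFN (so the MoE initially matches the dense model), and a newly, randomly initialized router is added on top. The dimensions, the top-2 routing, and the ReLU FFN are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense FFN weights (d_model -> d_ff -> d_model).
d_model, d_ff, n_experts = 16, 64, 4
w_in = rng.standard_normal((d_model, d_ff)) * 0.02
w_out = rng.standard_normal((d_ff, d_model)) * 0.02

# Upcycling: each expert begins as an exact copy of the dense FFN.
experts = [(w_in.copy(), w_out.copy()) for _ in range(n_experts)]

# The router is the only newly initialized component.
w_router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x, top_k=2):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ w_router                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of top-k experts
    out = np.zeros_like(x)
    for t, expert_ids in enumerate(top):
        sel = logits[t, expert_ids]
        gate = np.exp(sel - sel.max())
        gate /= gate.sum()                         # softmax over chosen experts
        for g, e in zip(gate, expert_ids):
            w_i, w_o = experts[e]
            h = np.maximum(x[t] @ w_i, 0.0)        # ReLU FFN
            out[t] += g * (h @ w_o)
    return out

tokens = rng.standard_normal((3, d_model))
print(moe_forward(tokens).shape)                   # (3, 16)
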
Autonomous applications are typically developed on top of Robot Operating System 2.0 (ROS2), even in time-critical systems such as automotive. Recent years have seen increased interest in developing model-based timing analysis and schedule optimization approaches …
External link:
http://arxiv.org/abs/2311.13333
Author:
Yi, Xiaodong, Zhang, Shiwei, Diao, Lansong, Wu, Chuan, Zheng, Zhen, Fan, Shiqing, Wang, Siyu, Yang, Jun, Lin, Wei
Published in:
IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 12, pp. 4694-4706, 1 Dec. 2022
This paper proposes DisCo, an automatic deep learning compilation module for data-parallel distributed training. Unlike most deep learning compilers, which focus on training or inference on a single device, DisCo optimizes a DNN model for distributed training …
External link:
http://arxiv.org/abs/2209.12769
To train modern large DNN models, pipeline parallelism has recently emerged; it distributes the model across GPUs and lets different devices process different microbatches in a pipeline. Earlier pipeline designs allow multiple versions of the model … (a toy illustration of this follows this entry)
External link:
http://arxiv.org/abs/2204.10562
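As a toy illustration of why such earlier pipeline designs keep multiple model versions around, the sketch below stashes the weight version each microbatch saw in its forward pass so that its backward pass can use the same version, even though later updates have already landed. The schedule and the version counter are made up for illustration and are not taken from the paper.

# Toy single-stage view of weight stashing in an asynchronous pipeline.
weights_version = 0
stash = {}  # microbatch id -> weight version used by its forward pass

def forward(mb_id):
    stash[mb_id] = weights_version
    print(f"forward  mb{mb_id} with weights v{weights_version}")

def backward(mb_id):
    used = stash.pop(mb_id)
    print(f"backward mb{mb_id} with stashed weights v{used}")

def apply_update():
    global weights_version
    weights_version += 1

# Forwards run ahead of backwards, so updates land between a microbatch's
# forward and backward; stashing keeps the two passes consistent.
forward(0); forward(1)
backward(0); apply_update()
forward(2)
backward(1); apply_update()
backward(2); apply_update()
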
Author:
Fan, Shiqing, Luo, Ye
Low-quality face image restoration is a popular research direction in computer vision. It can serve as a preprocessing step for tasks such as face detection and face recognition. At present, a great deal of work addresses the problem of low-quality …
External link:
http://arxiv.org/abs/2103.02121
Convolutional neural networks (CNNs) are used in many areas of machine learning. In practical applications, the computational cost of convolutional neural networks often grows high as the network deepens and the data volume increases, mostly … (a worked cost example follows this entry)
External link:
http://arxiv.org/abs/2103.02096
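To make the cost claim above concrete, here is the standard multiply-accumulate (MAC) count for a single 2D convolution layer, output height × output width × output channels × input channels × kernel size squared; the layer sizes in the example are arbitrary and not taken from the paper.

def conv2d_macs(h, w, c_in, c_out, k, stride=1, pad=0):
    """Multiply-accumulate count of a standard 2D convolution layer."""
    h_out = (h + 2 * pad - k) // stride + 1
    w_out = (w + 2 * pad - k) // stride + 1
    return h_out * w_out * c_out * c_in * k * k

# Example: a 3x3 convolution mapping 64 -> 128 channels on a 56x56 map.
macs = conv2d_macs(56, 56, 64, 128, 3, stride=1, pad=1)
print(f"{macs / 1e9:.2f} GMACs")  # ~0.23 GMACs for this single layer
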
Author:
Luo, Ye, Fan, Shiqing
We present a new model of neural networks called Min-Max-Plus Neural Networks (MMP-NNs) based on operations in tropical arithmetic. In general, an MMP-NN is composed of three types of alternately stacked layers, namely linear layers, min-plus layers, and max-plus layers … (a sketch of these tropical operations follows this entry)
External link:
http://arxiv.org/abs/2102.06358
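A minimal sketch of the three layer types named above, assuming the standard tropical (min, +) and (max, +) matrix-vector products, out_i = min_j (W_ij + x_j) and out_i = max_j (W_ij + x_j); the paper's exact parameterization may differ.

import numpy as np

rng = np.random.default_rng(0)

def linear(x, W, b):
    return W @ x + b

def min_plus(x, W):
    # Tropical (min, +) product: out_i = min_j (W_ij + x_j).
    return np.min(W + x[None, :], axis=1)

def max_plus(x, W):
    # Tropical (max, +) product: out_i = max_j (W_ij + x_j).
    return np.max(W + x[None, :], axis=1)

# One block of alternately stacked layers: linear -> min-plus -> max-plus.
d = 8
W1, b1 = rng.standard_normal((d, d)), rng.standard_normal(d)
W2, W3 = rng.standard_normal((d, d)), rng.standard_normal((d, d))

x = rng.standard_normal(d)
y = max_plus(min_plus(linear(x, W1, b1), W2), W3)
print(y.shape)  # (8,)
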
Author:
Wang, Siyu, Rong, Yi, Fan, Shiqing, Zheng, Zhen, Diao, LanSong, Long, Guoping, Yang, Jun, Liu, Xiaoyong, Lin, Wei
The last decade has witnessed growth in the computational requirements for training deep neural networks. Current approaches (e.g., data/model parallelism, pipeline parallelism) parallelize training tasks onto multiple devices. However, these approaches … (a minimal data-parallel sketch follows this entry)
External link:
http://arxiv.org/abs/2007.04069
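As a reminder of what the data parallelism mentioned above means in its simplest form, the sketch below splits a batch across hypothetical devices, computes per-shard gradients of a toy least-squares model, and averages them (the all-reduce step); it is purely illustrative and unrelated to the system the paper proposes.

import numpy as np

rng = np.random.default_rng(0)

def grad(w, X, y):
    # Gradient of the mean squared error 0.5 * ||Xw - y||^2 / n.
    return X.T @ (X @ w - y) / len(y)

n, d, n_devices = 64, 4, 4
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true
w = np.zeros(d)

# Data parallelism: each "device" holds a full copy of w and a shard of
# the batch; per-shard gradients are averaged, mimicking an all-reduce.
shards = np.array_split(np.arange(n), n_devices)
local_grads = [grad(w, X[idx], y[idx]) for idx in shards]
avg_grad = np.mean(local_grads, axis=0)
w -= 0.1 * avg_grad
print(np.round(avg_grad, 3))
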
Author:
Fan, Shiqing, Rong, Yi, Meng, Chen, Cao, Zongyan, Wang, Siyu, Zheng, Zhen, Wu, Chuan, Long, Guoping, Yang, Jun, Xia, Lixue, Diao, Lansong, Liu, Xiaoyong, Lin, Wei
It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach to improving device utilization. However, there are … (a toy pipeline schedule follows this entry)
External link:
http://arxiv.org/abs/2007.01045
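A toy sketch of the basic pipelined-training idea the entry above builds on: a minibatch is split into microbatches so that, at each time step, different stages (GPUs) work on different microbatches instead of idling. The printer below assumes a simple GPipe-style forward-only timeline and is not the paper's scheduler.

def pipeline_schedule(n_stages, n_microbatches):
    # Stage s runs the forward pass of microbatch m at time step s + m.
    n_steps = n_stages + n_microbatches - 1
    for t in range(n_steps):
        row = []
        for s in range(n_stages):
            m = t - s
            row.append(f"F{m}" if 0 <= m < n_microbatches else "--")
        print(f"t={t}: " + " ".join(row))

pipeline_schedule(n_stages=4, n_microbatches=6)
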
Academic article
This result cannot be displayed for users who are not signed in; sign in to view it.