Showing 1 - 10 of 19 for search: '"Junmin Xiao"'
Author:
Hang Cao, Liang Yuan, He Zhang, Yunquan Zhang, Baodong Wu, Kun Li, Shigang Li, Minghua Zhang, Pengqi Lu, Junmin Xiao
Published in:
IEEE Transactions on Parallel and Distributed Systems. 34:766-780
Author:
Junmin Xiao, Yunfei Pang, Qing Xue, Chaoyang Shui, Ke Meng, Hui Ma, Mingyi Li, Xiaoyang Zhang, Guangming Tan
Published in:
SC22: International Conference for High Performance Computing, Networking, Storage and Analysis.
Author:
Zhongzhe Hu, Junmin Xiao, Zheye Deng, Mingyi Li, Kewei Zhang, Xiaoyang Zhang, Ke Meng, Ninghui Sun, Guangming Tan
Published in:
Proceedings of the 36th ACM International Conference on Supercomputing.
Published in:
Concurrency and Computation: Practice and Experience. 34
Published in:
Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
Published in:
International Journal of Parallel Programming. 48:750-770
Data assimilation is an analysis technique that combines observations with the numerical results of theoretical models to deduce more realistic and accurate data. It is widely used in investigations of the atmosphere, ocean, and land surface. …
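As a hedged illustration of the idea in this snippet (not the paper's algorithm), a one-dimensional optimal-interpolation update blends a model forecast with an observation according to their error variances; every name and number below is an assumption:

# Minimal sketch of a scalar data-assimilation ("optimal interpolation")
# update: blend a model forecast with an observation, weighting each by
# its error variance. Illustrative only, not the method from the paper.
def assimilate(forecast, observation, var_forecast, var_obs):
    gain = var_forecast / (var_forecast + var_obs)  # trust the less noisy source more
    analysis = forecast + gain * (observation - forecast)
    var_analysis = (1.0 - gain) * var_forecast  # analysis is less uncertain than either input
    return analysis, var_analysis

# Example: model forecast 15.0 (variance 4.0), observation 16.0 (variance 1.0).
print(assimilate(15.0, 16.0, 4.0, 1.0))  # -> approximately (15.8, 0.8)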
Author:
Jian Peng, Junmin Xiao
Published in:
CCF Transactions on High Performance Computing. 1:144-160
In a computing platform composed of several homogeneous processors, any parallel schedule of an algorithm usually involves three basic costs: arithmetic throughput on each processor, data movement between processors, and synchronization latency. …
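These three costs are often captured by an alpha-beta-gamma style model (an assumption for illustration, not necessarily the exact model used in the paper): the time on processor p is

    T_p = \gamma F_p + \beta W_p + \alpha S_p

where F_p is the number of arithmetic operations, W_p the number of words moved between processors, S_p the number of synchronizations, and \gamma, \beta, \alpha are the per-operation, per-word, and per-synchronization costs. Optimizing a parallel schedule then amounts to trading these three terms against one another.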
Published in:
PPoPP
Convolution is the most time-consuming part in the computation of convolutional neural networks (CNNs), which have achieved great success in numerous practical applications. Due to the complex data dependency and the increase in the amount of model samples, the convolution suffers from high overhead on data movement. …
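For context, a minimal sketch of direct 2D convolution (illustrative only; the paper's optimized algorithm is not shown, and all shapes and names are assumptions) makes the nested-loop cost pattern visible:

# Naive direct 2D convolution: valid padding, stride 1. Each output
# element reads a KH x KW input window; this repeated access pattern is
# why convolution dominates CNN runtime and data movement.
import numpy as np

def conv2d(x, w):
    H, W = x.shape
    KH, KW = w.shape
    out = np.zeros((H - KH + 1, W - KW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+KH, j:j+KW] * w)
    return out

print(conv2d(np.arange(16.0).reshape(4, 4), np.ones((3, 3))))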
Published in:
SPAA
Convolution is the most time-consuming part in the computation of convolutional neural networks (CNNs). Due to the complex data dependency and the increase in the amount of model samples, the convolution suffers from high overhead on data movement. …
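One standard way to reduce the relative cost of data movement (a common technique shown for illustration, not necessarily the approach in this paper) is im2col lowering, which gathers the input windows once so the whole convolution becomes a single matrix product that a tuned GEMM can execute; names and shapes below are assumptions:

# im2col: copy every KH x KW input window into one row of a matrix,
# then compute the convolution as one matrix-vector product.
import numpy as np

def conv2d_im2col(x, w):
    H, W = x.shape
    KH, KW = w.shape
    OH, OW = H - KH + 1, W - KW + 1
    cols = np.stack([x[i:i+KH, j:j+KW].ravel()
                     for i in range(OH) for j in range(OW)])
    return (cols @ w.ravel()).reshape(OH, OW)

print(conv2d_im2col(np.arange(16.0).reshape(4, 4), np.ones((3, 3))))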
Author:
Junmin Xiao, Ninghui Sun, Hu Zhongzhe, Tian Zhongbo, Zhu Hongrui, Yao Chengji, Guangming Tan, Xiaoyang Zhang
Published in:
ISPA/BDCloud/SocialCom/SustainCom
Large-batch distributed synchronous stochastic gradient descent (SGD) has been widely used to train deep neural networks on a distributed-memory system with multiple nodes, which can leverage parallel resources to reduce the number of iterative steps. …
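A minimal single-process simulation of the synchronous data-parallel pattern (illustrative only, not the system described in the paper; the worker count, learning rate, and problem are assumptions) shows how per-worker gradients are averaged into one global update per step:

# Simulated synchronous SGD on a linear least-squares problem: each
# "worker" computes a gradient on its data shard, the gradients are
# averaged (what an allreduce would do across nodes), and one global
# update is applied per step.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 8))
true_w = rng.normal(size=8)
y = X @ true_w

w = np.zeros(8)
lr = 0.1
workers = 4
shards = np.array_split(np.arange(len(X)), workers)

for step in range(200):
    # Each worker's local gradient of mean squared error on its shard.
    grads = [2 * X[s].T @ (X[s] @ w - y[s]) / len(s) for s in shards]
    # Synchronous step: average and apply one large-batch update.
    w -= lr * np.mean(grads, axis=0)

print(np.linalg.norm(w - true_w))  # should be close to 0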