Adaptive Communication for Distributed Deep Learning on Commodity GPU Cluster
Author: | Li-Yung Ho, Jan-Jan Wu, Pangfeng Liu |
---|---|
Year of publication: | 2018 |
Subject: | Speedup, Artificial neural network, Computer science, Deep learning, Distributed computing, GPU cluster, Data modeling, Stochastic gradient descent, Server, Artificial intelligence, Data transmission |
Source: | CCGrid |
DOI: | 10.1109/ccgrid.2018.00043 |
Description: | Deep learning is currently the most promising approach to building computer systems with human-like intelligence. To speed up the training of neural networks, researchers have designed many distributed learning algorithms. These algorithms typically use a fixed constant as the communication period for model/gradient exchange. We find that this communication pattern can incur unnecessary and inefficient data transmission for some training methods, e.g., elastic SGD and gossiping SGD. In this paper, we propose an adaptive communication method that improves the performance of gossiping SGD. Instead of exchanging models at a fixed period, each machine exchanges its model with other machines according to how much its local model has changed (see the sketch after this record). This makes the communication more efficient and thus improves performance. The experimental results show that our method reduces communication traffic by 92%, which yields a 52% reduction in training time while preserving prediction accuracy compared with gossiping SGD. |
Database: | OpenAIRE |
External link: |
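
The abstract only describes the adaptive-exchange idea in words. Below is a minimal illustrative sketch, in Python/NumPy, of one way a drift-triggered gossip exchange could be structured. The class name, the L2-norm drift measure, and the `drift_threshold` parameter are assumptions made for illustration; they are not taken from the paper, whose exact trigger criterion is not given in this record.

```python
# Illustrative sketch only: "drift" is assumed here to be the L2 norm of the
# change in the local parameters since the last exchange, and drift_threshold
# is a hypothetical tuning parameter (not from the paper).
import numpy as np


class AdaptiveGossipWorker:
    def __init__(self, params, drift_threshold):
        self.params = params.astype(np.float64)   # local model as a flat vector
        self.snapshot = self.params.copy()        # model as of the last exchange
        self.drift_threshold = drift_threshold    # assumed trigger threshold

    def sgd_step(self, grad, lr=0.01):
        """Ordinary local SGD update."""
        self.params -= lr * grad

    def should_exchange(self):
        """Trigger an exchange only when the local model has drifted enough
        since the last exchange, instead of every fixed number of steps."""
        drift = np.linalg.norm(self.params - self.snapshot)
        return drift > self.drift_threshold

    def gossip_exchange(self, peer_params):
        """Gossip-style averaging with one peer's model, then reset the
        drift baseline."""
        self.params = 0.5 * (self.params + peer_params)
        self.snapshot = self.params.copy()


# Toy usage: two workers minimizing ||x||^2; worker 0 pulls and averages
# worker 1's model only when its drift criterion fires.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w0 = AdaptiveGossipWorker(rng.normal(size=10), drift_threshold=0.5)
    w1 = AdaptiveGossipWorker(rng.normal(size=10), drift_threshold=0.5)
    for step in range(100):
        w0.sgd_step(2.0 * w0.params)              # gradient of ||x||^2
        w1.sgd_step(2.0 * w1.params)
        if w0.should_exchange():
            w0.gossip_exchange(w1.params.copy())
```

A fixed-period gossiping worker would replace `should_exchange()` with a simple step counter; the adaptive variant communicates only when the local model has moved enough to be worth sharing, which is how the abstract motivates the reported 92% reduction in communication traffic.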