Upper Limit Analysis of Scalable Parallel Computing on the Premise of Reliability Requirement
Author: | Huanliang Xiong, Canghai Wu, Yefu Wang, Wei Wang, Guosun Zeng |
---|---|
Year of publication: | 2016 |
Subject: |
020203 distributed computing;
TOP500; Markov chain; Cost efficiency; Computer science; Process (engineering); Distributed computing; Embarrassingly parallel; 02 engineering and technology; Parallel computing; 020202 computer hardware & architecture; Limit (music); Scalability; 0202 electrical engineering, electronic engineering, information engineering; Electrical and Electronic Engineering; Reliability (statistics) |
Source: | IETE Technical Review. 33:573-583 |
ISSN: | 0974-5971 0256-4602 |
Description: | The Top500 supercomputers ranking has been held twice a year according to Linpack performance for more than 20 years, which greatly stimulates the development of high-performance computing. However, it is still not clear how to determine the scale limit of supercomputers. It will undoubtedly cause a waste of resources if we build bigger and bigger supercomputers without caring about other aspects of cost, energy, reliability. Thus, this paper analyses the scalability and scale limit for parallel computing with a reliability requirement. We use a Markov chain to model the state transition process of a parallel computing system, so the probability of parallel tasks running on machines successfully can be evaluated, that is the reliability of parallel computing. When parallel computing carries out an iso-speed efficiency extension under specific reliability requirements, we present an approach to calculate the maximum number of processing nodes and the maximum workload of parallel tasks, which actual... |
Database: | OpenAIRE |
External link: |
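
Illustrative note: the description says the paper models the system as a Markov chain and derives the maximum number of nodes that still meets a reliability requirement. The sketch below is not the authors' model, which the truncated abstract does not spell out. It assumes a hypothetical two-state chain (running, failed), independent node failures with a per-step probability `p_fail`, and a run time of `steps` that stays fixed as nodes are added. The function names `task_reliability` and `max_nodes` are invented for illustration.

```python
def task_reliability(n_nodes, p_fail, steps):
    """Probability that a task on n_nodes survives `steps` steps with no failure.

    Assumed model (not from the paper): state 0 = running, state 1 = failed
    (absorbing); any single node failure moves the task to the failed state.
    """
    # Probability that all nodes survive one step (independence assumed).
    q = (1.0 - p_fail) ** n_nodes
    running, failed = 1.0, 0.0  # the chain starts in the running state
    for _ in range(steps):
        # One transition of the two-state chain.
        running, failed = running * q, failed + running * (1.0 - q)
    return running


def max_nodes(p_fail, steps, required):
    """Largest node count whose reliability is still >= required (0 if none)."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must lie strictly between 0 and 1")
    if not 0.0 < required <= 1.0:
        raise ValueError("required must lie in (0, 1]")
    n = 0
    # Reliability decreases strictly with n, so a linear scan terminates.
    while task_reliability(n + 1, p_fail, steps) >= required:
        n += 1
    return n
```

Under these assumptions, reliability falls geometrically with both node count and run time, so any fixed reliability requirement caps the usable number of nodes. The paper's analysis additionally scales the workload under iso-speed efficiency, which this sketch does not attempt.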