Abstract: |
Modern vector architectures tend to be equipped with high-bandwidth memory, which makes them interesting candidates for solving large-scale graph processing problems. However, the highly irregular structure of real-world graphs makes it extremely challenging to map fundamental graph-processing problems onto vector systems. This paper describes the world's first attempt to create efficient, vector-friendly implementations of various connected components algorithms for the modern NEC SX-Aurora TSUBASA architecture, which combines high computational performance with memory of the world's highest bandwidth. To develop fast implementations, supercomputer co-design principles are used, including selecting vector-friendly graph algorithms, adapting these algorithms to the target architecture, selecting a vectorized graph storage format, and applying various optimisations aimed at making more efficient use of the target platform's memory hierarchy. In addition, the paper analyses whether similar implementation approaches can be used for modern NVIDIA GPU architectures, which share many properties and features with SX-Aurora TSUBASA. Finally, a comprehensive comparative performance analysis is presented for all algorithms, architectures and optimisations discussed in the paper. [ABSTRACT FROM AUTHOR] |