Description: |
Graph-structured data is prevalent because of its ability to capture relations between real-world entities. However, graph data analysis applications, including traditional and machine-learning-based approaches, are highly resource-demanding, necessitating massively parallel hardware such as distributed clusters. Domain-specific systems, which aim to hide the hardware complexity from application users, suffer from communication and computation efficiency problems. This thesis tackles these problems with a set of novel specialized system designs, one for each category of workloads. For graph analytics workloads, we propose to enforce precise loop-carried dependency propagation to reduce redundant communication and computation in our SympleGraph system. SympleGraph achieves up to 2.30x and 7.76x speedups over Gemini and D-Galois, two state-of-the-art systems. For graph pattern mining workloads, we propose to co-design the pattern decomposition algorithm and compilation techniques to improve computation efficiency, and to leverage application-characteristics-aware optimizations to reduce and efficiently hide communication overhead, in our DecoMine and Khuzdul systems, respectively. Our extensive experiments show that DecoMine and Khuzdul significantly outperform previous state-of-the-art solutions. For graph neural network training, we propose pipelined model parallelism for deep model training, which reduces the worst-case communication complexity by a factor of the model depth. With the proposed technique, our system, GNNPipe, reduces the communication volume by up to 22.89x and speeds up training by up to 2.45x.