On optimizing operator fusion plans for large-scale machine learning in SystemML
Authors: | Prithviraj Sen, Matthias Boehm, Alexandre V. Evfimievski, Berthold Reinwald, Dylan Hutchison, Niketan Pansare |
---|---|
Year of publication: | 2018 |
Subject: |
Fusion; Scale (ratio); Selection (relational algebra); Computer science; General Engineering; 020207 software engineering; 02 engineering and technology; Machine learning; Operator (computer programming); 020204 information systems; Linear algebra; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); Code generation; Artificial intelligence; Heuristics |
Source: | Proceedings of the VLDB Endowment. 11:1755-1768 |
ISSN: | 2150-8097 |
DOI: | 10.14778/3229863.3229865 |
Description: | Many machine learning (ML) systems allow the specification of ML algorithms by means of linear algebra programs, and automatically generate efficient execution plans. The opportunities for fused operators (in terms of fused chains of basic operators) are ubiquitous, and include fewer materialized intermediates, fewer scans of inputs, and sparsity exploitation across operators. However, existing fusion heuristics struggle to find good plans for complex operator DAGs or hybrid plans of local and distributed operations. In this paper, we introduce an exact yet practical cost-based optimization framework for fusion plans and describe its end-to-end integration into Apache SystemML. We present techniques for candidate exploration and selection of fusion plans, as well as code generation of local and distributed operations over dense, sparse, and compressed data. Our experiments in SystemML show end-to-end performance improvements of up to 22x, with negligible compilation overhead. |
Database: | OpenAIRE |
External link: |