Autotuning and Specialization: Speeding up Matrix Multiply for Small Matrices with Compiler Technology.

Authors: Shin, Jaewook; Hall, Mary W.; Chame, Jacqueline; Chen, Chun; Hovland, Paul D.
Source: Software Automatic Tuning; 2010, pp. 353-370 (18 pages)
Abstract: Autotuning technology has emerged recently as a systematic process for evaluating alternative implementations of a computation to select the best-performing solution for a particular architecture. Specialization optimizes code customized to a particular class of input data. This paper presents a compiler optimization approach that combines novel autotuning compiler technology with specialization for expected data set sizes of key computations, focused on matrix multiplication of small matrices. We describe compiler techniques developed for this approach, including the interface to a polyhedral transformation system for generating specialized code and the heuristics used to prune the enormous search space of alternative implementations. We demonstrate significantly better performance than directly using libraries such as GOTO, ATLAS, and ACML BLAS that are not specifically optimized for the problem sizes on hand. In a case study of Nek5000, a spectral-element-based code that extensively uses the specialized matrix multiply, we demonstrate a performance improvement of 36% for the full application. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
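
Illustrative note: the two ideas named in the abstract, autotuning (timing alternative implementations and keeping the fastest) and specialization (generating code for known matrix sizes), can be sketched in a few lines. This is only a toy under stated assumptions and is not the authors' method: the paper works on compiled code through a polyhedral transformation system with search-space pruning heuristics, none of which is reproduced here. All names below (`matmul_ijk`, `matmul_ikj`, `make_specialized`, `autotune`) are hypothetical and chosen for illustration.

```python
import timeit


def matmul_ijk(a, b, m, k, n):
    """Generic triple loop in i-j-k order; sizes are runtime parameters."""
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            s = 0.0
            for p in range(k):
                s += a[i][p] * b[p][j]
            c[i][j] = s
    return c


def matmul_ikj(a, b, m, k, n):
    """Same computation with the loops interchanged to i-k-j order."""
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        ci = c[i]
        for p in range(k):
            aip = a[i][p]
            bp = b[p]
            for j in range(n):
                ci[j] += aip * bp[j]
    return c


def make_specialized(m, k, n):
    """Generate a variant specialized to fixed sizes (fully unrolled).

    Hypothetical stand-in for compiler-generated specialized code: the
    sizes are baked into the generated source, so no loops remain.
    """
    lines = ["def specialized(a, b):", "    return ["]
    for i in range(m):
        row = ", ".join(
            " + ".join(f"a[{i}][{p}] * b[{p}][{j}]" for p in range(k))
            for j in range(n)
        )
        lines.append(f"        [{row}],")
    lines.append("    ]")
    namespace = {}
    exec("\n".join(lines), namespace)
    fn = namespace["specialized"]
    # Wrap so every variant shares the same call signature.
    return lambda a, b, m_, k_, n_: fn(a, b)


def autotune(variants, a, b, m, k, n, repeats=200):
    """Time each variant on the given inputs and return the fastest name."""
    timings = {}
    for name, fn in variants.items():
        runs = timeit.repeat(lambda: fn(a, b, m, k, n), number=repeats, repeat=3)
        timings[name] = min(runs)  # best-of-3 reduces timing noise
    best = min(timings, key=timings.get)
    return best, timings
```

To try it, build a dictionary such as `{"ijk": matmul_ijk, "ikj": matmul_ikj, "specialized": make_specialized(4, 4, 4)}` and pass it to `autotune` together with two 4x4 inputs. Which variant wins depends on the machine and interpreter, which is the reason for measuring rather than predicting.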