Metaprogramming dense linear algebra solvers. Applications to multi and many-core architectures

Authors: Masliah, Ian; Baboulin, Marc; Falcou, Joel
Contributors: Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), Performance Optimization by Software Transformation and Algorithms & Librairies Enhancement (POSTALE), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Baboulin, Marc
Language: English
Publication year: 2015
Subject:
Source: 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2015), Aug 2015, Helsinki, Finland
Description: International audience; The increasing complexity of new parallel architectures has widened the gap between the adaptability and the efficiency of codes. Because high-performance numerical libraries tend to focus more on performance than on adaptability, we address this issue using a C++ library called NT2. By analyzing the properties of the linear algebra domain that can be extracted from numerical libraries and combining them with architectural features, we developed a generic approach to solving dense linear systems on various architectures, including CPU and GPU. We then extended this work with an example of a mixed-precision least squares solver based on semi-normal equations, which is not available in current libraries. For the automatically generated solvers, we report performance comparisons with state-of-the-art codes and show that it is possible to obtain generic code with a high-level interface (similar to MATLAB) that runs on either CPU or GPU without incurring significant overhead.
Database: OpenAIRE