Towards an automatic generation of dense linear algebra solvers on parallel architectures

Authors:	Baboulin, Marc; Falcou, Joel; Masliah, Ian
Contributors:	Performance Optimization by Software Transformation and Algorithms & Librairies Enhancement (POSTALE); Laboratoire de Recherche en Informatique (LRI); Université Paris-Sud - Paris 11 (UP11); CentraleSupélec; Centre National de la Recherche Scientifique (CNRS); Inria Saclay - Ile de France; Institut National de Recherche en Informatique et en Automatique (Inria)
Language:	English
Publication year:	2014
Subject:	Numerical libraries; dense linear systems; active libraries; Computer Science::Mathematical Software; generative programming; mixed precision algorithms; GPU computing; [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; [INFO.INFO-NA] Computer Science [cs]/Numerical Analysis [cs.NA]
Source:	[Research Report] RR-8615, Université Paris-Sud; INRIA. 2014, pp.20
Description:	The increasing complexity of new parallel architectures has widened the gap between the adaptability and the efficiency of codes. Because high-performance numerical libraries tend to favor performance over adaptability, we address this issue using a C++ library called NT2. By analyzing the properties of the linear algebra domain that can be extracted from numerical libraries and combining them with architectural features, we developed a generic approach to solving dense linear systems on various architectures, including CPU and GPU. We then extended this work with an example of a least squares solver based on semi-normal equations in mixed precision, which cannot be found in current libraries. For the automatically generated solvers, we report performance comparisons with state-of-the-art codes, showing that it is possible to obtain generic code with a high-level interface (similar to Matlab) that can run on either CPU or GPU without generating significant overhead.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::afffe43f688c1094d32a2db6300e9ca4 https://hal.inria.fr/hal-01075663/file/RR-8615.pdf