Showing 1 - 10 of 47 for search: '"Sven Hammarling"'
Author:
Sven Hammarling, Nicholas J. Higham
Published in:
Computing in Science & Engineering. 24:6-11
Author:
Jakub Kurzak, Mark Gates, Nicholas J. Higham, Azzam Haidar, Jack Dongarra, Stanimire Tomov, Timothy B. Costa, Ahmad Abdelfattah, Mawussi Zounon, Sven Hammarling, Piotr Luszczek
Published in:
ACM Transactions on Mathematical Software. 47:1-23
This article describes a standard API for a set of Batched Basic Linear Algebra Subprograms (Batched BLAS or BBLAS). The focus is on many independent BLAS operations on small matrices that are grouped together and processed by a single routine, calle
Author:
Mark Gates, Panruo Wu, Maksims Abalenkovs, Azzam Haidar, David Stevens, Negin Bagherpour, Piotr Luszczek, Ichitaro Yamazaki, Jack Dongarra, Jakub Kurzak, Asim YarKhan, Samuel D. Relton, Mawussi Zounon, Jakub Šístek, Sven Hammarling
Published in:
ACM Transactions on Mathematical Software. 45:1-35
The recent version of the Parallel Linear Algebra Software for Multicore Architectures (PLASMA) library is based on tasks with dependencies from the OpenMP standard. The main functionality of the library is presented. Extensive benchmarks are targete
Author:
Jack Dongarra, Negin Bagherpour, Sven Hammarling, Jakub Šístek, David Stevens, Mawussi Zounon, Samuel D. Relton, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Panruo Wu, Ichitaro Yamazaki, Asim YarKhan, Maksims Abalenkovs
Published in:
ACM Transactions on Mathematical Software
Author:
Nicholas J. Higham, Jack Dongarra, Samuel D. Relton, Sven Hammarling, Mawussi Zounon, Pedro Valero-Lara
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Dongarra, J, Hammarling, S, Higham, N, Relton, S, Valero-Lara, P & Zounon, M 2017, The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems. In Procedia Computer Science, vol. 108, pp. 495-504. https://doi.org/10.1016/j.procs.2017.05.138
Recercat. Dipósit de la Recerca de Catalunya
Procedia Computer Science
ICCS
A current trend in high-performance computing is to decompose a large linear algebra problem into batches containing thousands of smaller problems, that can be solved independently, before collating the results. To standardize the interface to these
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b3492315b5a5b771f982cb2a225a088
http://hdl.handle.net/2117/106913
Published in:
Euro-Par 2017: Parallel Processing
Lecture Notes in Computer Science
Dongarra, J, Hammarling, S, Higham, N, Relton, S & Zounon, M 2017, Optimized Batched Linear Algebra for Modern Architectures. In F F Rivera, T F Pena & J C Cabaleiro (eds), Euro-Par 2017: Parallel Processing: 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28-September 1, 2017, Proceedings. Lecture Notes in Computer Science, vol. 10417, Springer Nature, pp. 511-522. https://doi.org/10.1007/978-3-319-64203-1_37
Lecture Notes in Computer Science ISBN: 9783319642024
Euro-Par
Solving large numbers of small linear algebra problems simultaneously is becoming increasingly important in many application areas. Whilst many researchers have investigated the design of efficient batch linear algebra kernels for GPU architectures,
Published in:
ACM Transactions on Mathematical Software. 39:1-19
We develop a new algorithm for the computation of all the eigenvalues and optionally the right and left eigenvectors of dense quadratic matrix polynomials. It incorporates scaling of the problem parameters prior to the computation of eigenvalues, a c
Published in:
ACM Transactions on Mathematical Software. 34:1-33
On cache based computer architectures using current standard algorithms, Householder bidiagonalization requires a significant portion of the execution time for computing matrix singular values and vectors. In this paper we reorganize the sequence of
Published in:
IEEE Control Systems. 24:60-76
The article "High-Performance Numerical Software for Control" concerns the development of quality software for control applications that perform efficiently and reliably on today's modern computing machines. In particular, the subroutine library SLIC
Published in:
Scopus-Elsevier
This paper considers key ideas in the design of out-of-core dense LU factorization routines. A left-looking variant of the LU factorization algorithm is shown to require less I/O to disk than the right-looking variant, and is used to develop a parall