Showing 1 - 10 of 12 for search: '"Andrés E. Tomás"'
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
The International Journal of High Performance Computing Applications
Krylov methods provide a fast and highly parallel numerical tool for the iterative solution of many large-scale sparse linear systems. To a large extent, the performance of practical realizations of these methods is constrained by the communication b
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8536cdc4fe665b6856a826d6c80505ee
http://hdl.handle.net/10234/200483
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
Tuning and optimising the operations executed in deep learning frameworks is a fundamental task in accelerating the processing of deep neural networks (DNNs). However, this optimisation usually requires extensive manual efforts in order to obtain the
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b1de51e45b08ba63594e6733b224b0ba
http://hdl.handle.net/10234/198110
Author:
Andrés E. Tomás, Enrique S. Quintana-Ortí, José Ignacio Aliaga, Yuhsiang M. Tsai, Hartwig Anzt
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
Euro-Par 2020: Parallel Processing Workshops ISBN: 9783030715922
Euro-Par Workshops
Euro-Par 2020: Parallel Processing Workshops
We contribute to the optimization of the sparse matrix-vector product on graphics processing units by introducing a variant of the coordinate sparse matrix layout that compresses the integer representation of the matrix indices. In addition, we emplo
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5fa13beadee9f0ee9769b564e813ada3
Author:
Hartwig Anzt, Thomas Grützmacher, José Ignacio Aliaga, Andrés E. Tomás, Enrique S. Quintana-Ortí
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our a
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e62ad6aa304640fdddeea869b2c24c41
Author:
Rocío Carratalá-Sáez, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí, Andrés E. Tomás, Sandra Catalán
Published in:
Parallel Computing
We investigate the introduction of look-ahead in two-stage algorithms for the singular value decomposition (SVD). Our approach relies on a specialized reduction for the first stage that produces a band matrix with the same upper and lower bandwidth i
Author:
Adrián Castelló, Sergio Barrachina, Manuel F. Dolz, Enrique S. Quintana-Ortí, Pau San Juan, Andrés E. Tomás
Published in:
Repositori Universitat Jaume I
Universitat Jaume I
We evolve PyDTNN, a framework for distributed parallel training of Deep Neural Networks (DNNs), into an efficient inference tool for convolutional neural networks. Our optimization process on multicore ARM processors involves several high-level trans
Author:
Goran Flegar, Andrés E. Tomás, A. Cristiano I. Malossi, Giovani Mariani, Enrique S. Quintana-Ortí, Florian Scheidegger, Vedran Novaković
Published in:
RiuNet. Repositorio Institucional de la Universitat Politècnica de València
[EN] We present FloatX (Float eXtended), a C++ framework to investigate the effect of leveraging customized floating-point formats in numerical applications. FloatX formats are based on binary IEEE 754 with smaller significand and exponent bit counts
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7d35b586cd14d3d9e0af37e256f20a82
https://doi.org/10.1145/3368086
Published in:
Lecture Notes in Computer Science ISBN: 9783030293994
Euro-Par
Euro-Par 2019: Parallel Processing-25th International Conference on Parallel and Distributed Computing, Göttingen, Germany, August 26–30, 2019, Proceedings
Lecture Notes in Computer Science
Lecture Notes in Computer Science-Euro-Par 2019: Parallel Processing
We present a method for the QR factorization of large tall-and-skinny matrices that combines block Gram-Schmidt and the Cholesky decomposition to factorize the input matrix column panels, overcoming the sequential nature of this operation. This metho
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f9ac035ab51640e3f0e8749ba31129b1
https://doi.org/10.1007/978-3-030-29400-7_33
Published in:
Lecture Notes in Computer Science ISBN: 9783030024642
ISC Workshops
Lecture Notes in Computer Science
Lecture Notes in Computer Science-High Performance Computing
High Performance Computing-ISC High Performance 2018 International Workshops, Frankfurt/Main, Germany, June 28, 2018, Revised Selected Papers
We investigate the solution of sparse linear systems via iterative methods based on Krylov subspaces. Concretely, we combine the use of extended precision in the outer iterative refinement with a reduced precision in the inner Conjugate Gradient solv
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::af2aa6d68f765b5ee7d5a988713502b0
https://doi.org/10.1007/978-3-030-02465-9_39
Published in:
RiuNet. Repositorio Institucional de la Universitat Politècnica de València
Lecture Notes in Computer Science ISBN: 9783642387173
VECPAR
The QR decomposition with column pivoting (QRP) of a matrix is widely used for rank revealing. The performance of the LAPACK implementation (DGEQP3) of the Householder QRP algorithm is limited by the Level 2 BLAS operations required for updating the column n
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::be1ce793cad9ad9abde25d1b6258dcfc
https://doi.org/10.1007/978-3-642-38718-0_8