Showing 1 - 8 of 8 results for search: '"Peltekis, Christodoulos"'
The widespread adoption of machine learning algorithms necessitates hardware acceleration to ensure efficient performance. This acceleration relies on custom matrix engines that operate on full or reduced-precision floating-point arithmetic. However,
External link:
http://arxiv.org/abs/2408.11997
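The entry above refers to matrix engines that operate on full or reduced-precision floating-point arithmetic. As a minimal illustration of reduced precision (an assumption-laden sketch, not the paper's actual hardware), the following emulates bfloat16 by truncating a float32 bit pattern to its top 16 bits:

```python
import struct

def to_bfloat16(x):
    # Pack the value as a 32-bit float, then zero the low 16 mantissa bits.
    # This mimics bfloat16 truncation: same exponent range as float32,
    # but only ~3 decimal digits of precision.
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    return struct.unpack('>f', struct.pack('>I', bits & 0xFFFF0000))[0]
```

Exactly representable values such as 1.0 survive unchanged, while values like pi lose their low-order mantissa bits.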
Structured sparsity is an efficient way to prune the complexity of modern Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. In such cases, the acceleration of structured-sparse ML models is handled by sparse
External link:
http://arxiv.org/abs/2402.10850
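The abstract above discusses structured sparsity as a hardware-friendly way to prune ML models. One widely used pattern is 2:4 sparsity (at most two nonzeros per group of four, as in NVIDIA's sparse tensor cores); whether this specific pattern is the one studied in the paper is an assumption. A minimal magnitude-based pruning sketch:

```python
def prune_2_4(row):
    # Keep the 2 largest-magnitude values in each group of 4, zero the rest.
    out = []
    for g in range(0, len(row), 4):
        group = row[g:g + 4]
        keep = sorted(range(len(group)),
                      key=lambda i: abs(group[i]), reverse=True)[:2]
        out.extend(v if i in keep else 0 for i, v in enumerate(group))
    return out
```

The fixed nonzero-per-group budget is what lets hardware store only the surviving values plus small per-group indices.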
Transformers have improved drastically the performance of natural language processing (NLP) and computer vision applications. The computation of transformers involves matrix multiplications and non-linear activation functions such as softmax and GELU
External link:
http://arxiv.org/abs/2402.10118
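The softmax and GELU non-linearities mentioned in this abstract have standard textbook definitions, sketched below (the tanh approximation of GELU is a common choice; this is illustrative code, not the paper's implementation):

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def gelu(x):
    # tanh approximation of GELU (Hendrycks & Gimpel).
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

Softmax outputs sum to one; GELU behaves like the identity for large positive inputs and approaches zero for large negative ones.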
Author:
Peltekis, Christodoulos, Filippas, Dionysios, Dimitrakopoulos, Giorgos, Nicopoulos, Chrysostomos
Published in:
In Microprocessors and Microsystems, October 2023, 102
Academic article
This result cannot be displayed to users who are not signed in.
Sign in to view this result.
Academic article
This result cannot be displayed to users who are not signed in.
Sign in to view this result.
Author:
Peltekis, Christodoulos, Filippas, Dionysios, Dimitrakopoulos, Giorgos, Nicopoulos, Chrysostomos
Systolic Array (SA) architectures are well suited for accelerating matrix multiplications through the use of a pipelined array of Processing Elements (PEs) communicating with local connections and pre-orchestrated data movements. Even though most of
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::374b359255d32a835bd44432e103a10b
http://arxiv.org/abs/2304.12691
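The abstract above describes systolic arrays as pipelined grids of Processing Elements with local connections computing matrix multiplications. A minimal functional sketch of the arithmetic an output-stationary schedule performs (sequential Python emulation under that assumption, not the paper's architecture):

```python
def matmul_output_stationary(A, B):
    # Each (i, j) position plays the role of a PE that accumulates
    # A[i][k] * B[k][j] across time steps k, as in an output-stationary
    # systolic array where partial sums stay resident in each PE.
    n, k, m = len(A), len(A[0]), len(B[0])
    C = [[0] * m for _ in range(n)]
    for step in range(k):          # one reduction step per "cycle"
        for i in range(n):
            for j in range(m):
                C[i][j] += A[i][step] * B[step][j]
    return C
```

In real hardware the inner two loops run in parallel across the PE grid; only the reduction over `step` is sequential.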
Author:
Filippas, Dionysios, Peltekis, Christodoulos, Dimitrakopoulos, Giorgos, Nicopoulos, Chrysostomos
The acceleration of deep-learning kernels in hardware relies on matrix multiplications that are executed efficiently on Systolic Arrays (SA). To effectively trade off deep-learning training/inference quality with hardware cost, SA accelerators employ
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4dc5968feccbe477fd010207e7b4f8c0
http://arxiv.org/abs/2304.01668