Showing 1 - 10 of 66 for search: '"Alexander Heinecke"'
Author:
Evangelos Georganas, Dhiraj Kalamkar, Sasikanth Avancha, Menachem Adelman, Deepti Aggarwal, Cristina Anderson, Alexander Breuer, Jeremy Bruestle, Narendra Chaudhary, Abhisek Kundu, Denise Kutnick, Frank Laub, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Brian Retford, Barukh Ziv, Alexander Heinecke
Published in:
Frontiers in Applied Mathematics and Statistics, Vol 8 (2022)
During the past decade, novel Deep Learning (DL) algorithms, workloads and hardware have been developed to tackle a wide range of problems. Despite the advances in workload and hardware ecosystems, the programming methodology of DL systems is stagnan
External link:
https://doaj.org/article/0ca258dcbf0f493b97c2c2aef3829f43
Author:
Rui Ma, Evangelos Georganas, Alexander Heinecke, Sergey Gribok, Andrew Boutros, Eriko Nurvitadhi
Published in:
IEEE Computer Architecture Letters. 21:49-52
Author:
Narendra Chaudhary, Sanchit Misra, Dhiraj Kalamkar, Alexander Heinecke, Evangelos Georganas, Barukh Ziv, Menachem Adelman, Bharat Kaul
Published in:
2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
Author:
Alexander Heinecke, Evangelos Georganas, Nesreen K. Ahmed, Dhiraj D. Kalamkar, Vasimuddin, Sasikanth Avancha, Sanchit Misra, Guixiang Ma, Ramanarayan Mohanty
Published in:
SC
Full-batch training on Graph Neural Networks (GNN) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible. It is challenging due to large memory capacity and bandwidth requirements
Author:
Menachem Adelman, Narendra Chaudhary, Dhiraj D. Kalamkar, Barukh Ziv, Bharat Kaul, Sanchit Misra, Alexander Heinecke, Evangelos Georganas
Identifying accessible chromatin regions is a fundamental problem in epigenomics with ATAC-seq being a commonly used assay. Exponential rise in single cell ATAC-seq experiments has made it critical to accelerate processing of ATAC-seq data. ATAC-seq
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::0a2881cf74d8f1fb3a0dc7ee5bae7313
https://doi.org/10.1101/2021.09.28.462099
Author:
Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Mikhail Shiryaev, Jianping Chen, Dhiraj D. Kalamkar
Published in:
SC
During the last two years, the goal of many researchers has been to squeeze the last bit of performance out of HPC system for AI tasks. Often this discussion is held in the context of how fast ResNet50 can be trained. Unfortunately, ResNet50 is no lo
Author:
Kunal Banerjee, Dhiraj D. Kalamkar, Anand Venkat, Alexander Heinecke, Evangelos Georganas, Sasikanth Avancha, Hans Pabst, Greg Henry, Michael J. Anderson
Published in:
IPDPS
Deep learning (DL) is one of the most prominent branches of machine learning. Due to the immense computational cost of DL workloads, industry and academia have developed DL libraries with highly-specialized kernels for each workload/architecture, lea
Author:
Kunal Banerjee, Alexander Heinecke, Evangelos Georganas, Dhiraj D. Kalamkar, Barukh Ziv, Cristina S. Anderson, Eden Segal
Published in:
Supercomputing Frontiers and Innovations. 6
Recurrent neural network (RNN) models have been found to be well suited for processing temporal data. In this work, we present an optimized implementation of vanilla RNN cell and its two popular variants: LSTM and GRU for Intel Xeon architecture. Typ
Author:
Srinivas Sridharan, Kunal Banerjee, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Mikhail E. Smorkalov, Dhiraj D. Kalamkar, Cong Xu
Published in:
CLUSTER
Google’s neural machine translation (GNMT) is state-of-the-art recurrent neural network (RNN/LSTM) based language translation application. It is computationally more demanding than well-studied convolutional neural networks (CNNs). Also, in contras
Published in:
ARITH
In recent years fused-multiply-add (FMA) units with lower-precision multiplications and higher-precision accumulation have proven useful in machine learning/artificial intelligence applications, most notably in training deep neural networks due to th