Showing 1 - 2 of 2 for search: '"Czaja, Jacek"'
In this paper we present a methodology for automatically creating Roofline models for Non-Uniform Memory Access (NUMA) systems, using Intel Xeon as an example. Finally, we present an evaluation of highly efficient deep learning primitives as implemented in th
External link:
http://arxiv.org/abs/2009.11224
Softmax is a popular normalization method used in machine learning. Deep learning solutions such as Transformer and BERT use the softmax function intensively, so it is worthwhile to optimize its performance. This article presents our methodology of optimiz
External link:
http://arxiv.org/abs/1904.12380