Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library

Authors: Hiroyuki Ootomo, Rio Yokota
Publication year: 2023
Source: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region.
DOI: 10.1145/3578178.3578238
Database: OpenAIRE