Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library
Authors: | Hiroyuki Ootomo, Rio Yokota |
---|---|
Year of publication: | 2023 |
Source: | Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region. |
DOI: | 10.1145/3578178.3578238 |
Database: | OpenAIRE |