Showing 1 - 10 of 227 for search: '"Endo, Toshio"'
Author:
Zhang, Lingqi, Wahib, Mohamed, Chen, Peng, Meng, Jintao, Wang, Xiao, Endo, Toshio, Matsuoka, Satoshi
General Purpose Graphics Processing Units (GPGPU) are used in most of the top systems in HPC. The total capacity of scratchpad memory has increased by more than 40 times in the last decade. However, existing optimizations for stencil computations usi
External link:
http://arxiv.org/abs/2306.03336
Author:
Zhang, Lingqi, Wahib, Mohamed, Chen, Peng, Meng, Jintao, Wang, Xiao, Endo, Toshio, Matsuoka, Satoshi
Iterative stencils are used widely across the spectrum of High Performance Computing (HPC) applications. Many efforts have been put into optimizing stencil GPU kernels, given the prevalence of GPU-accelerated supercomputers. To improve the data local
External link:
http://arxiv.org/abs/2305.07390
Author:
Moses, William S., Ivanov, Ivan R., Domke, Jens, Endo, Toshio, Doerfert, Johannes, Zinenko, Oleksandr
While parallelism remains the main source of performance, architectural implementations and programming models change with each new hardware generation, often leading to costly application re-engineering. Most tools for performance portability requir
External link:
http://arxiv.org/abs/2207.00257
Author:
Zhang, Lingqi, Wahib, Mohamed, Chen, Peng, Meng, Jintao, Wang, Xiao, Endo, Toshio, Matsuoka, Satoshi
Iterative memory-bound solvers commonly occur in HPC codes. Typical GPU implementations have a loop on the host side that invokes the GPU kernel as many times as there are time/algorithm steps. The termination of each kernel implicitly acts as the barrier req
External link:
http://arxiv.org/abs/2204.02064
Author:
Suzumura, Toyotaro, Sugiki, Akiyoshi, Takizawa, Hiroyuki, Imakura, Akira, Nakamura, Hiroshi, Taura, Kenjiro, Kudoh, Tomohiro, Hanawa, Toshihiro, Sekiya, Yuji, Kobayashi, Hiroki, Matsushima, Shin, Kuga, Yohei, Nakamura, Ryo, Jiang, Renhe, Kawase, Junya, Hanai, Masatoshi, Miyazaki, Hiroshi, Ishizaki, Tsutomu, Shimotoku, Daisuke, Miyamoto, Daisuke, Aida, Kento, Takefusa, Atsuko, Kurimoto, Takashi, Sasayama, Koji, Kitagawa, Naoya, Fujiwara, Ikki, Tanimura, Yusuke, Aoki, Takayuki, Endo, Toshio, Ohshima, Satoshi, Fukazawa, Keiichiro, Date, Susumu, Uchibayashi, Toshihiro
The growing amount of data and advances in data science have created a need for a new kind of cloud platform that provides users with flexibility, strong security, and the ability to couple with supercomputers and edge devices through high-performanc
External link:
http://arxiv.org/abs/2203.14188
Stencil computation is one of the most widely-used compute patterns in high performance computing applications. Spatial and temporal blocking have been proposed to overcome the memory-bound nature of this type of computation by moving memory pressure
External link:
http://arxiv.org/abs/2001.01473
Author:
Ito, Yuki, Imai, Haruki, Duc, Tung Le, Negishi, Yasushi, Kawachiya, Kiyokuni, Matsumiya, Ryo, Endo, Toshio
GPUs are widely used to accelerate deep learning with neural networks (NNs). On the other hand, since GPU memory capacity is limited, it is difficult to implement efficient programs that compute large NNs on GPU. To compute NNs exceeding GPU memory capacity, dat
External link:
http://arxiv.org/abs/1907.05013
Author:
Moses, William S., Ivanov, Ivan R., Domke, Jens, Endo, Toshio, Doerfert, Johannes, Zinenko, Oleksandr
Published in:
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming.
While parallelism remains the main source of performance, architectural implementations and programming models change with each new hardware generation, often leading to costly application re-engineering. Most tools for performance portability requir
Published in:
情報処理学会研究報告. (No. 24)
Author:
Nomura, Akihiro, Endo, Toshio
Published in:
情報処理学会研究報告. (No. 14):1-8