Abstrakt: |
Existing GPU simulators are too slow to use for neural networks implemented on GPUs. For fast performance estimation, we propose a novel hybrid method that combines analytical performance modeling with sampled simulation of GPUs. By taking full advantage of the repeated computation in neural networks, we devise three sampling techniques: Inter-Kernel sampling, Intra-Kernel sampling, and Streaming Multiprocessor sampling. The key technique is to estimate the average IPC (instructions per cycle) through sampled simulation, taking into account the effects of the warp scheduler and memory access contention. Compared with GPGPU-Sim, the proposed technique reduces simulation time by up to 450 times with less than 5.0% accuracy loss. |