Thermal-Aware Scheduling for Deep Learning on Mobile Devices With NPU

Authors: Tan, Tianxiang; Cao, Guohong
Source: IEEE Transactions on Mobile Computing; December 2024, Vol. 23, Issue 12, pp. 10706-10719 (14 pages)
Abstract: As Deep Neural Networks (DNNs) have been successfully applied to various fields, there is tremendous demand for running DNNs on mobile devices. Although the mobile GPU can be leveraged to improve performance, it consumes a large amount of energy. After a short period of time, the mobile device may overheat, forcing the processors to reduce their clock speed and significantly slowing processing. A different approach to supporting DNNs on mobile devices is to leverage Neural Processing Units (NPUs). Compared to the GPU, the NPU is much faster and more energy efficient, but less accurate because it uses low-precision floating-point numbers. We propose to combine these two approaches to improve the performance of running DNNs on mobile devices by studying the thermal-aware scheduling problem, where the goal is to achieve a better tradeoff between processing time and accuracy while ensuring that the mobile device does not overheat. To solve the problem, we propose a heuristic-based scheduling algorithm that determines when to run DNNs on the GPU and when to run them on the NPU based on the current state of the mobile device. Because the heuristic-based algorithm makes scheduling decisions greedily and ignores their future impact, we also propose a deep reinforcement learning based scheduling algorithm to further improve performance. Extensive evaluation results show that the proposed algorithms can significantly improve the performance of running DNNs on mobile devices while avoiding overheating.
Database: Supplemental Index
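
The greedy idea described in the abstract (pick the GPU or the NPU for each inference based only on the current device state) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Processor` fields, the linear per-frame heating model, the 45 °C throttling threshold, the safety margin, and every numeric value below are hypothetical assumptions chosen only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    latency_ms: float    # hypothetical time to process one frame
    accuracy: float      # hypothetical model accuracy on this processor
    heat_delta_c: float  # hypothetical temperature change per frame (deg C)


# Illustrative numbers only; none of them come from the paper.
GPU = Processor("GPU", latency_ms=60.0, accuracy=0.76, heat_delta_c=0.8)
NPU = Processor("NPU", latency_ms=15.0, accuracy=0.70, heat_delta_c=-0.3)


def choose_processor(temp_c, throttle_temp_c=45.0, margin_c=1.0):
    """Greedy rule: prefer the more accurate GPU, but only if running one
    more GPU frame keeps the device below the throttling threshold minus
    a safety margin. Future frames are ignored."""
    if temp_c + GPU.heat_delta_c <= throttle_temp_c - margin_c:
        return GPU
    return NPU


def simulate(num_frames, start_temp_c=35.0, ambient_c=30.0):
    """Run the greedy rule for num_frames and return (name, temp) pairs."""
    temp = start_temp_c
    trace = []
    for _ in range(num_frames):
        proc = choose_processor(temp)
        # Assumed toy thermal model: fixed delta per frame, floored at ambient.
        temp = max(ambient_c, temp + proc.heat_delta_c)
        trace.append((proc.name, temp))
    return trace


if __name__ == "__main__":
    for name, temp in simulate(20):
        print(f"{name}: {temp:.1f} C")
```

Because the rule looks only one frame ahead, it tends to run the GPU until the device is near the limit and then alternate between the two processors. A learning-based scheduler such as the deep reinforcement learning approach mentioned in the abstract is motivated by exactly this short-sightedness.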