Tensor Layout Optimization of Convolution for Inference on Digital Signal Processor

Authors: Xiaobin Zhang, Xiaoyang Zhang, Guangming Tan, Tian Zhongbo, Junmin Xiao, Zhu Hongrui, Hu Zhongzhe
Year of publication: 2019
Subject:
Source: ISPA/BDCloud/SocialCom/SustainCom
DOI: 10.1109/ispa-bdcloud-sustaincom-socialcom48970.2019.00036
Description: The development of artificial intelligence and 5G technology has driven the Internet of Things into the era of the Artificial Intelligence of Things (AIoT). As a part of AIoT, the Digital Signal Processor (DSP) plays an important role in many AI chips. It performs multiply-accumulate calculations quickly. Unlike GPUs, which usually apply general matrix multiplication to convolution calculations, DSPs compute convolution directly. Therefore, most GPU optimization techniques are not suitable for DSPs. In many AIoT scenarios, the inputs and outputs of convolution layers are usually irregular, which affects the efficiency of data storage. To make full use of the DSP, this work analyzes the performance effects of different tensor layouts. Based on this analysis, a hybrid tensor layout optimization approach is proposed. Furthermore, test results show that the accuracy of the empirical criterion for layout selection is not high. Hence, an automatic tuning approach is designed to accurately choose suitable layouts for the different convolution layers in a neural network. Numerical experiments show that, for the convolution calculation of each layer in MobileNet, the hybrid tensor layout optimization approach achieves a speed-up of up to 11x compared with single-layout schedules. Furthermore, the accuracy of the automatic approach reaches 99.27% for ordinary convolution, and the convolution calculation of a whole network under the tensor layouts chosen by the automatic approach performs 72.8% better than that based on the common layout.
Database: OpenAIRE
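
Note: The description refers to tensor layouts, direct convolution, and automatic layout selection without showing them. The sketch below illustrates the general idea in plain Python and is not the authors' implementation. The two layouts (channel-first and channel-last), the function names `offset_chw`, `offset_hwc`, `pack`, `direct_conv`, and `pick_layout`, and the timing-based selection are all assumptions made for illustration; the record does not say which layouts or tuning procedure the paper uses, and timings measured in Python say nothing about DSP performance.

```python
import timeit

def offset_chw(c, h, w, C, H, W):
    # Channel-first layout: elements along the width axis are contiguous.
    return (c * H + h) * W + w

def offset_hwc(c, h, w, C, H, W):
    # Channel-last layout: the channels of one pixel are contiguous.
    return (h * W + w) * C + c

def pack(x, offset, C, H, W):
    """Flatten a nested list x[c][h][w] into a 1-D buffer using a layout."""
    buf = [0.0] * (C * H * W)
    for c in range(C):
        for h in range(H):
            for w in range(W):
                buf[offset(c, h, w, C, H, W)] = x[c][h][w]
    return buf

def direct_conv(buf, offset, kernel, C, H, W):
    """Direct (loop-nest) convolution: stride 1, no padding, one output channel.

    kernel is indexed as kernel[c][i][j] with a square K x K window.
    """
    K = len(kernel[0])
    out = []
    for oh in range(H - K + 1):
        row = []
        for ow in range(W - K + 1):
            acc = 0.0
            for c in range(C):
                for i in range(K):
                    for j in range(K):
                        idx = offset(c, oh + i, ow + j, C, H, W)
                        acc += buf[idx] * kernel[c][i][j]  # multiply-accumulate
            row.append(acc)
        out.append(row)
    return out

def pick_layout(x, kernel, C, H, W, layouts, repeats=3):
    """Hypothetical tuner: time each candidate layout and return the fastest."""
    timings = {}
    for name, offset in layouts.items():
        buf = pack(x, offset, C, H, W)
        runs = timeit.repeat(
            lambda: direct_conv(buf, offset, kernel, C, H, W),
            number=1, repeat=repeats)
        timings[name] = min(runs)
    return min(timings, key=timings.get), timings

if __name__ == "__main__":
    C, H, W = 2, 3, 3
    x = [[[c * 9 + h * 3 + w for w in range(W)] for h in range(H)] for c in range(C)]
    kernel = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(C)]
    layouts = {"CHW": offset_chw, "HWC": offset_hwc}
    best, timings = pick_layout(x, kernel, C, H, W, layouts)
    print("fastest layout in this run:", best)
```

Both layouts produce the same convolution result; only the order in which memory is touched differs, which is why measuring each candidate per layer is one plausible way to choose between them.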