Author:
Zhang, Luoming; He, Yefei; Lou, Zhenyu; Ye, Xin; Wang, Yuxing; Zhou, Hong
Subject:
Source:
Applied Intelligence; Mar 2023, Vol. 53, Issue 6, p6266-6275, 10p
Abstract:
Quantizing deep neural network models to low precision can unlock benefits such as shorter inference time and lower energy and memory consumption, but it also induces performance degradation and instability during training. The Straight-Through Estimator (STE) is widely used in Quantization-Aware Training (QAT) to overcome these shortcomings and achieves good results at 2-, 3-, and 4-bit precision. Different STE functions can perform differently under different quantization precision settings. To explore the bit-width range over which STE functions are applicable and to stabilize training, we propose Root Quantization. Root Quantization combines two estimators: a linear estimator and a root estimator. While the linear estimator follows existing methods for training the quantizer and weights under the task loss, the root estimator is based on a high-degree root function and acts as a correction module that fine-tunes the weights, not only approximating the gradient of the quantization error but also making the gradient more accurate. The root estimator can also adapt each layer's root degree to the most suitable value via the task-loss gradient. Extensive experiments on CIFAR-10 and ImageNet with different network architectures, across a range of bit-widths, demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
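To make the STE mechanism named in the abstract concrete, the following is a minimal PyTorch sketch of straight-through quantization-aware training. It shows only the generic STE the paper builds on; the class name, the 4-bit setting, and the uniform symmetric quantizer are illustrative assumptions, and Root Quantization's linear and root estimators are not reproduced here.

import torch

class STEQuantize(torch.autograd.Function):
    """Uniform symmetric b-bit quantizer with a straight-through backward pass."""

    @staticmethod
    def forward(ctx, w, bits):
        n = 2 ** (bits - 1) - 1          # number of positive quantization levels
        w_clamped = w.clamp(-1.0, 1.0)   # restrict to the quantizer's range
        ctx.save_for_backward(w)
        return torch.round(w_clamped * n) / n

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Plain STE: pass the gradient through unchanged inside the clipping
        # range and zero it outside. The paper's root estimator would instead
        # reshape this gradient using a per-layer, learnable high-degree root
        # of the quantization error; that formulation is not reproduced here.
        pass_through = (w.abs() <= 1.0).to(grad_output.dtype)
        return grad_output * pass_through, None

# Usage: quantize weights in the forward pass while keeping gradients flowing.
w = torch.randn(8, requires_grad=True)
w_q = STEQuantize.apply(w, 4)            # 4-bit quantized copy of w
loss = (w_q ** 2).sum()                  # stand-in for a task loss
loss.backward()                          # gradients reach w through the STE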
Database:
Complementary Index
External link: