CNN Inference: Dynamic and Predictive Quantization

Author:	Manu Mathew, Pramod Kumar Swami, Kumar Desappan, Mihir Mody, Praveen Eppa
Year of publication:	2018
Subject:	Floating point; Contextual image classification; Computer science; business.industry; Quantization (signal processing); Deep learning; 05 social sciences; Inference; Kalman filter; 010501 environmental sciences; 01 natural sciences; Convolutional neural network; 0502 economics and business; Segmentation; Artificial intelligence; 050207 economics; business; Algorithm; 0105 earth and related environmental sciences
Source:	ICCE-Berlin
DOI:	10.1109/icce-berlin.2018.8576251
Description:	Deep learning techniques such as Convolutional Neural Networks (CNNs) are the de facto method for image classification, with broad usage across automotive, industrial, medical, and robotics applications. Efficient implementation of CNN inference on embedded devices requires a quantization method that minimizes accuracy loss, generalizes across deployment scenarios, and supports real-time processing. Existing literature does not address all three requirements simultaneously. In this paper, we propose a novel quantization algorithm to overcome these challenges. The proposed solution dynamically selects the scale for quantizing activations and uses a Kalman filter to predict the quantization scale, reducing accuracy loss. It exploits range statistics from previous inferences to estimate the quantization scale, which enables a real-time solution. The solution is implemented on TI’s TDA family of embedded automotive processors and runs real-time semantic segmentation on the TDA2x processor with an accuracy loss within 0.1% compared with the floating-point algorithm. It performs well across multiple deployment scenarios (e.g., rain, snow, night), demonstrating its generalization capability.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::411b429bed74b1eebe7a773285071d74 https://doi.org/10.1109/icce-berlin.2018.8576251
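
The description above outlines the idea of predicting an activation quantization scale from the range statistics of earlier inferences with a Kalman filter. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes a scalar random-walk state model, symmetric 8-bit quantization, and one filter per layer. The names `ScalarKalman`, `quantize_int8`, and `run_frames`, as well as the noise parameters `q` and `r`, are hypothetical and chosen for illustration.

```python
import numpy as np


class ScalarKalman:
    """1-D Kalman filter tracking the activation range of one layer.

    Assumption: the range follows a random walk between frames.
    """

    def __init__(self, x0, p0=1.0, q=1e-2, r=1e-1):
        self.x = float(x0)  # estimated range (max absolute activation)
        self.p = float(p0)  # variance of the estimate
        self.q = float(q)   # process noise variance (hypothetical value)
        self.r = float(r)   # measurement noise variance (hypothetical value)

    def predict(self):
        # Random walk: the estimate is carried over, its uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Blend the measured range z into the estimate using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


def quantize_int8(x, max_abs):
    """Symmetric 8-bit quantization; values beyond max_abs are clipped."""
    scale = 127.0 / max(max_abs, 1e-8)
    q = np.clip(np.round(x * scale), -127, 127).astype(np.int8)
    return q, scale


def run_frames(frames, kf):
    """Quantize each frame's activations with a range predicted beforehand."""
    outputs = []
    for act in frames:
        predicted_range = kf.predict()          # known before the frame runs
        outputs.append(quantize_int8(act, predicted_range))
        kf.update(float(np.max(np.abs(act))))   # measured range feeds the next frame
    return outputs
```

In this sketch the scale for a frame is available before its activations are computed, because it depends only on earlier frames, which is what would make such a scheme compatible with real-time processing. Dequantization is `q / scale`.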