Rethinking Generalization in American Sign Language Prediction for Edge Devices with Extremely Low Memory Footprint
Author: | Stuti Sehgal, Aditya Jyoti Paul, Puranjay Mohan |
---|---|
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; J.3; American Sign Language; Edge device; Computer science; Computer Science - Artificial Intelligence; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; Computer Science - Human-Computer Interaction; I.2.10; I.4.8; I.5.1; I.4.1; K.4.2; Inference; 02 engineering and technology; Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); 0202 electrical engineering, electronic engineering, information engineering; Quantization (image processing); Computer Science - Computation and Language; 020208 electrical & electronic engineering; Intelligent decision support system; 68T45; 68T10; 68T07; 68U10; 021001 nanoscience & nanotechnology; language.human_language; Artificial Intelligence (cs.AI); Computer engineering; language; Memory footprint; 0210 nano-technology; Computation and Language (cs.CL) |
DOI: | 10.48550/arxiv.2011.13741 |
Description: | Due to the boom in technical compute in the last few years, the world has seen massive advances in artificially intelligent systems solving diverse real-world problems. But a major roadblock in the ubiquitous acceptance of these models is their enormous computational complexity and memory footprint. Hence efficient architectures and training techniques are required for deployment on extremely low resource inference endpoints. This paper proposes an architecture for detection of alphabets in American Sign Language on an ARM Cortex-M7 microcontroller having just 496 KB of framebuffer RAM. Leveraging parameter quantization is a common technique that might cause varying drops in test accuracy. This paper proposes using interpolation as augmentation amongst other techniques as an efficient method of reducing this drop, which also helps the model generalize well to previously unseen noisy data. The proposed model is about 185 KB post-quantization and inference speed is 20 frames per second. Comment: 6 pages, Published in IEEE RAICS 2020, see https://raics.in |
Database: | OpenAIRE |