Low complexity multi-directional in-air ultrasonic gesture recognition using a TCN

Authors: Ibrahim, Emad, Geilen, Marc C.W., Huisken, Jos A., Li, Min, Pineda de Gyvez, Jose, Di Natale, Giorgio, Bolchini, Cristiana, Vatajelu, Elena-Ioana
Contributors: Electronic Systems, Cyber-Physical Systems Center Eindhoven, Systemic Change, Model-Based Design Lab, CompSOC Lab - Predictable & Composable Embedded Systems
Language: English
Publication year: 2020
Subject:
Source: Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020, 1259-1264
Description: Following the trend of ultrasound-based gesture recognition, this study introduces the concept of time-sequence classification of ultrasonic patterns induced by hand movements on a microphone array. We refer to time-sequence ultrasound echoes as continuous frequency patterns received in real time at different steering angles. The ultrasound source is a single tone emitted continuously from the center of the microphone array. Meanwhile, the array beamforms and locates ultrasonic activity (induced echoes), after which a processing pipeline is initiated to extract band-limited frequency features. These beamformed features are organized in a 2D matrix of size 11 × 30, updated every 10 ms, on which a Temporal Convolutional Network (TCN) outputs a continuous classification. Prior to that, the same TCN is trained to classify the Doppler shift variability rate. Using this approach, we show that a user can easily achieve 49 gestures at different steering angles by means of sequence detection. To keep it simple for users, we define two Doppler shift variability rates, very slow and very fast, which the TCN detects 95-99% of the time. Not only can a gesture be performed in different directions, but the length of each performed gesture can also be measured. This broadens the diversity of in-air ultrasonic gestures, allowing more control capabilities. The process is designed for low-resource settings: because this real-time process is always on, its power and memory usage should be optimized. The proposed solution needs 6.2 to 10.2 MMACs and a memory footprint of 6 KB, allowing such a gesture recognition system to be hosted by energy-constrained edge devices such as smart speakers.
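Illustrative note: the description names a Temporal Convolutional Network but does not give its layer configuration. The sketch below shows only the generic building block of a TCN, a causal dilated 1D convolution, together with the receptive field of a stack of such layers. The function names, kernel size, and dilation values are assumptions chosen for illustration and are not taken from the paper.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal dilated 1D convolution over a single channel.

    x        -- list of samples ordered in time
    weights  -- kernel taps; weights[0] multiplies the current sample,
                weights[k] multiplies the sample k * dilation steps earlier
    dilation -- spacing between taps (hypothetical value, not from the paper)
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start are treated as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of past-and-present time steps seen by a stack of
    causal dilated convolutions, one convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    print(causal_dilated_conv1d([1, 2, 3, 4], [1, 1], dilation=2))
    print(receptive_field(2, [1, 2, 4, 8]))
```

Because each output depends only on the current and earlier samples, such a layer can produce a new classification as each feature update arrives, which fits the always-on setting the description mentions.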
Database: OpenAIRE