ESDAR-net: towards high-accuracy and real-time driver action recognition for embedded systems.
Author: | Hu, Yaocong; Shuai, Zhen; Yang, Huicheng; Wan, Guoyang; Zhang, Yajun; Xie, Chao; Lu, Mingqi; Lu, Xiaobo |
---|---|
Source: | Multimedia Tools & Applications; Feb 2024, Vol. 83, Issue 6, pp. 18281-18307, 27 pp. |
Abstract: | Existing driver action recognition approaches face a bottleneck: the trade-off between recognition accuracy and computational efficiency. More specifically, high-capacity spatial-temporal deep learning models cannot achieve real-time driver action recognition on vehicle-mounted devices. To overcome this limitation, this paper puts forward a novel driver action recognition solution suitable for embedded systems. The proposed ESDAR-Net is a multi-branch deep learning framework that directly processes compressed videos. To reduce the computational cost, a lightweight 2D/3D convolutional network is employed for spatial-temporal modeling. Moreover, two strategies are implemented to boost accuracy: (1) a cross-layer connection module (CLCM) and a spatial-temporal trilinear pooling module (STTPM) are designed to adaptively fuse appearance and motion information; (2) complementary knowledge from a high-capacity spatial-temporal deep learning model is distilled and transferred to the proposed ESDAR-Net. Experimental results show that the proposed ESDAR-Net achieves both high accuracy and real-time performance for driver action recognition. Accuracy on SEU-DAR-V1 and SEU-DAR-V2 reaches 98.7% and 96.5%, respectively, with 2.19M learnable parameters, 0.253G FLOPs, and a speed of 27 clips/s on a Jetson TX2. [ABSTRACT FROM AUTHOR] |
Database: | Complementary Index |
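
The abstract's second strategy, distilling knowledge from a high-capacity model into the lightweight network, is not specified in detail in this record. As a rough illustration only, the sketch below shows a generic soft-target distillation loss (temperature-softened teacher outputs combined with the usual hard-label cross-entropy). The function names, the temperature of 4.0, and the weighting `alpha` are assumptions chosen for the example and may differ from what the paper actually uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical helper: weighted sum of hard-label cross-entropy and
    soft-target KL divergence (teacher || student) at the given temperature.
    The hyperparameter values are illustrative assumptions, not the paper's."""
    # Hard-label term: cross-entropy of the student against the true class.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # Soft-target term: KL divergence between temperature-softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl
```

With `alpha=1.0` only the soft-target term remains, so a student whose logits match the teacher's exactly incurs zero loss; any mismatch makes the term positive.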