Description: |
Due to the impact of COVID-19, online physical education (PE) teaching has attracted increasing attention. Given the characteristics of online PE teaching, using artificial intelligence to automatically detect or recognize students’ actions and behaviors has become a growing trend. However, traditional cloud computing-based intelligent online PE teaching systems often suffer from high computational complexity and latency. Edge computing can address these problems, but edge devices typically have limited computing power, while existing deep action recognition models often contain a large number of parameters and require significant computational resources, making them difficult to deploy on edge devices. To address these issues, this paper proposes a lightweight video recognition method, the lightweight video ViT (LWV-ViT) network. Specifically, building on the standard ViT model, a video-based ViT (VBViT) network is first introduced, which adds a cross-temporal token interaction module to effectively process temporal information in videos. The LWV-ViT network is then obtained by applying a spatial-temporal pruning scheme that reduces the number of parameters. Finally, the proposed LWV-ViT network is deployed in an edge computing-based online PE teaching system, where it is installed on each edge device. This setup enables fast data processing, reduces transmission latency, and protects sensitive data. Experimental results show that the proposed LWV-ViT network achieves the best recognition rates on both behavior detection (96.5%, 95.73%) and action recognition (97.9%, 88.3%, 79.9%) tasks with the fewest trainable parameters (2.7 M), indicating that it is well suited to edge computing-based online PE teaching systems. |