Hybrid Transformer-EfficientNet Model for Robust Human Activity Recognition: The BiTransAct Approach

Author: Aftab Ul Nabi, Jinglun Shi, Kamlesh, Awais Khan Jumani, Jameel Ahmed Bhutto
Language: English
Publication year: 2024
Source: IEEE Access, Vol 12, Pp 184517-184528 (2024)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3506598
Description: Human Activity Recognition (HAR) has been employed in a number of applications, including sports analytics, healthcare monitoring, surveillance, and human-computer interaction. Despite a decade of research on HAR, existing models still struggle with occlusion, computational efficiency, and capturing long-term temporal dependencies. To address these shortcomings, we present BiTransAct, a novel hybrid model that incorporates EfficientNet-B0 for spatial feature extraction and a Transformer encoder to capture the temporal relationships in video data. To evaluate the performance of the proposed model, we employ a video-based dataset, SPHAR-Dataset-1.0, which contains 7,759 videos spanning 14 diverse activities and 421,441 samples. Our experiments establish that BiTransAct consistently outperforms other deep learning models such as SWIN, EfficientNet, and RegNet in both classification accuracy and precision. Its efficiency in handling large datasets without compromising performance makes it a strong candidate for real-time HAR tasks. Furthermore, features such as the self-attention mechanism and a dynamic learning rate make BiTransAct more robust and help avoid overfitting. The results demonstrate that BiTransAct provides a scalable, efficient solution for HAR applications, with particular relevance to real-world scenarios such as video surveillance and healthcare monitoring.
Database: Directory of Open Access Journals
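
A minimal sketch of the hybrid design the abstract describes: per-frame spatial features from an EfficientNet-B0 backbone, fed to a Transformer encoder that applies self-attention over the frame sequence. The torchvision backbone, the 14-class head (matching the dataset above), and the embedding size, head count, and layer count are assumptions for illustration, not the authors' published BiTransAct configuration.

# Sketch of an EfficientNet-B0 + Transformer encoder hybrid for video HAR,
# in the spirit of the BiTransAct abstract. Hyperparameters are assumed.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0

class HybridHARModel(nn.Module):
    def __init__(self, num_classes=14, d_model=1280, n_heads=8, n_layers=2):
        super().__init__()
        # Per-frame spatial feature extractor: EfficientNet-B0 with its
        # classifier removed, so each frame yields a 1280-d feature vector.
        backbone = efficientnet_b0(weights=None)
        backbone.classifier = nn.Identity()
        self.backbone = backbone
        # Transformer encoder models temporal dependencies across frames
        # via self-attention.
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, clips):
        # clips: (batch, frames, 3, H, W)
        b, t, c, h, w = clips.shape
        feats = self.backbone(clips.reshape(b * t, c, h, w))  # (b*t, 1280)
        feats = feats.reshape(b, t, -1)                       # (b, t, 1280)
        feats = self.temporal(feats)                          # attention over time
        return self.head(feats.mean(dim=1))                   # pool frames, classify

# Smoke test on a dummy batch of two 8-frame clips.
model = HybridHARModel()
logits = model(torch.randn(2, 8, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 14])

The "dynamic learning rate" the abstract mentions could likewise be approximated in training with a standard scheduler such as torch.optim.lr_scheduler.ReduceLROnPlateau; the paper's exact schedule is not given in this record.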