Autor: |
Jahangir F; Department of Computer Science, HITEC University, Taxila 47080, Pakistan., Khan MA; Department of Computer Science, HITEC University, Taxila 47080, Pakistan.; Department of Informatics, University of Leicester, Leicester LE1 7RH, UK., Alhaisoni M; College of Computer Science and Engineering, University of Ha'il, Ha'il 81451, Saudi Arabia., Alqahtani A; Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-kharj 16242, Saudi Arabia., Alsubai S; Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-kharj 16242, Saudi Arabia., Sha M; Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-kharj 16242, Saudi Arabia., Al Hejaili A; Faculty of Computers & Information Technology, Computer Science Department, University of Tabuk, Tabuk 71491, Saudi Arabia., Cha JH; Department of computer Science, Hanyang University, Seoul 04763, Republic of Korea. |
Abstrakt: |
The performance of human gait recognition (HGR) is affected by the partial obstruction of the human body caused by the limited field of view in video surveillance. The traditional method required the bounding box to recognize human gait in the video sequences accurately; however, it is a challenging and time-consuming approach. Due to important applications, such as biometrics and video surveillance, HGR has improved performance over the last half-decade. Based on the literature, the challenging covariant factors that degrade gait recognition performance include walking while wearing a coat or carrying a bag. This paper proposed a new two-stream deep learning framework for human gait recognition. The first step proposed a contrast enhancement technique based on the local and global filters information fusion. The high-boost operation is finally applied to highlight the human region in a video frame. Data augmentation is performed in the second step to increase the dimension of the preprocessed dataset (CASIA-B). In the third step, two pre-trained deep learning models-MobilenetV2 and ShuffleNet-are fine-tuned and trained on the augmented dataset using deep transfer learning. Features are extracted from the global average pooling layer instead of the fully connected layer. In the fourth step, extracted features of both streams are fused using a serial-based approach and further refined in the fifth step by using an improved equilibrium state optimization-controlled Newton-Raphson (ESOcNR) selection method. The selected features are finally classified using machine learning algorithms for the final classification accuracy. The experimental process was conducted on 8 angles of the CASIA-B dataset and obtained an accuracy of 97.3, 98.6, 97.7, 96.5, 92.9, 93.7, 94.7, and 91.2%, respectively. Comparisons were conducted with state-of-the-art (SOTA) techniques, and showed improved accuracy and reduced computational time. |