Sign language recognition from digital videos using feature pyramid network with detection transformer

Authors:	Yu Liu, Parma Nand, Md Akbar Hossain, Minh Nguyen, Wei Qi Yan
Publication year:	2023
Subject:	Computer Networks and Communications; Hardware and Architecture; Media Technology; Software
Source:	Multimedia Tools and Applications. 82:21673-21685
ISSN:	1573-7721 1380-7501
DOI:	10.1007/s11042-023-14646-0
Description:	Sign language recognition is one of the fundamental ways to help deaf people communicate with others. An accurate vision-based sign language recognition system using deep learning is a central goal for many researchers. Deep convolutional neural networks have been studied extensively in the last few years, and many architectures have been proposed. Recently, Vision Transformer and other Transformers have shown clear advantages in object recognition over traditional computer vision models such as Faster R-CNN, YOLO, SSD, and other deep learning models. In this paper, we propose a Vision Transformer-based sign language recognition method called DETR (Detection Transformer), aiming to improve on the current state-of-the-art sign language recognition accuracy. The proposed DETR method recognizes sign language from digital videos with high accuracy using a new deep learning model, ResNet152 + FPN (Feature Pyramid Network), which is based on Detection Transformer. Our experiments show that the method has excellent potential for improving sign language recognition accuracy. For instance, the newly proposed ResNet152 + FPN network improves detection accuracy by up to 1.70% on the sign language test dataset compared with the standard Detection Transformer models. In addition, the proposed method attained an overall accuracy of 96.45%.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::d9ce6acd8b97432819c031132a4eb3b2 https://doi.org/10.1007/s11042-023-14646-0