Text Detection from Image and Video Frames Using Improved Golden Jackal Optimization based Shallow Convolutional Neural Network.

Author: Chaitra, M., Roopa, M. J., Prakruthi, M. K., Chayadevi, M. L., Shree, M. Rajani, Swetha, M. D.
Subject:
Source: International Journal of Intelligent Engineering & Systems; 2024, Vol. 17 Issue 5, p65-75, 11p
Abstract: Text detection plays a significant role in reading text content, but text in natural scene images is difficult to localize because it is distributed throughout the scene. Low resolution and variations in text orientation, size, and writing style make text detection a challenging process. To overcome these problems, this article proposes a Cosine Similarity-Golden Jackal Optimization (CSGJO) based Shallow Convolutional Neural Network (CNN). The proposed method is evaluated on the ICDAR2015, MSRA-TD500, and CTW1500 datasets. Data augmentation and normalization are used as preprocessing techniques, which increase the dataset size and remove null values. The preprocessed images are passed to a feature extraction stage using SE-ResNet152, which extracts the relevant features from the input. The extracted features are given to CSGJO, which can consider the correlation among various features. After feature selection, the shallow CNN is employed to detect text in input images and video frames. Precision, recall, and F-measure are used as metrics for assessing model performance. The CSGJO-based shallow CNN attained a high precision of 92.57%, 93.69%, and 89.46% on the ICDAR2015, MSRA-TD500, and CTW1500 datasets, respectively. The proposed model performed better than existing methods such as Multi-Headed Self-Attention (MHSA), Graph Convolutional Networks (GCN), and Decoupled Feature Pyramid Networks (DFPN). [ABSTRACT FROM AUTHOR]
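Note: The abstract names cosine similarity as the way feature correlation is taken into account, and precision, recall, and F-measure as the evaluation metrics. The following is a minimal sketch of those standard definitions only, not the authors' CSGJO implementation. The function names and the zero-vector and zero-denominator conventions are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat a zero vector as uncorrelated with everything.
        return 0.0
    return dot / (norm_u * norm_v)


def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F-measure from detection counts.

    tp: correctly detected text regions
    fp: detections that match no ground-truth region
    fn: ground-truth regions that were missed
    """
    # Assumption: return 0.0 whenever a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

Two features whose vectors have a cosine similarity near 1 carry largely redundant information, which is the kind of correlation a selection step could penalize. How the similarity enters the Golden Jackal Optimization update is not stated in the abstract.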
Database: Complementary Index