A novel pipeline framework for multi oriented scene text image detection and recognition

Author: Vahid Ghods, Fatemeh Naiemi, Hassan Khalesi
Publication year: 2021
Subject:
Source: Expert Systems with Applications. 170:114549
ISSN: 0957-4174
Description: Automatic text detection and recognition (end-to-end text recognition) in real-life images is a core component of many applications, including assistance systems for blind and low-vision users and self-driving cars. However, curved and vertical text is challenging to detect because of color bleeding, font size variation, and complicated backgrounds. In this paper, a convolutional neural network-based pipeline is introduced to obtain high-level visual features and improve text detection and recognition efficiency. A ResNet-50 network pre-trained on ImageNet and SynthText was used to extract low-level visual features. Moreover, new improved ReLU layer (new.i.ReLU) blocks with varied receptive fields are used in the proposed structure; they have a strong ability to detect text components even on curved surfaces. New improved inception layers (new.i.inception layers) capture text of widely varying sizes more effectively than a linear chain of convolution layers. We also propose a pipeline framework for character recognition that is robust to irregular (curved and vertical) text. First, we introduce a novel algorithm, called local word directional pattern (LWDP), that encodes each pixel's value into a new value that highlights the texture of the characters. The output of LWDP is then used as the input image for the text recognition process. Experiments on standard benchmarks, including the ICDAR 2013, ICDAR 2015, and ICDAR 2019 datasets, demonstrate the superiority of the proposed architecture over prior work.
Database: OpenAIRE