Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network
Authors: | Yong Zhong, Jiachen Dang, Dongping Cao |
---|---|
Publication year: | 2021 |
Subject: |
Physics and Astronomy (miscellaneous); Machine vision; Computer science; General Mathematics; scene text detection; 02 engineering and technology; Convolutional neural network; convolutional neural networks; 0202 electrical engineering, electronic engineering, information engineering; Computer Science (miscellaneous); Pyramid (image processing); Pixel; multioriented text; business.industry; lcsh:Mathematics; Pattern recognition; lcsh:QA1-939; 021001 nanoscience & nanotechnology; Real image; Object detection; Chemistry (miscellaneous); Feature (computer vision); Benchmark (computing); 020201 artificial intelligence & image processing; Artificial intelligence; 0210 nano-technology; business |
Source: | Symmetry, Volume 13, Issue 3; Symmetry, Vol 13, Iss 486, p 486 (2021) |
ISSN: | 2073-8994 |
DOI: | 10.3390/sym13030486 |
Description: | Scene text detection, the task of detecting text in real images, is an active research topic in the machine vision community. Most current methods are based on anchor boxes; they are complex in model design and time-consuming to train. In this paper, we propose a new text detection method based on Fully Convolutional One-Stage Object Detection (FCOS) that robustly detects multioriented and multilingual text in natural scene images using per-pixel prediction. Unlike state-of-the-art text detectors that rely on predefined anchor boxes, our proposed detector is anchor-free. To enhance the feature representation ability of FCOS for text detection, we apply the Bidirectional Feature Pyramid Network (BiFPN) as the backbone network, which enhances the model's learning capacity and increases the receptive field. We demonstrate the superior performance of our method on multioriented (ICDAR-2015, ICDAR-2017 MLT) and horizontal (ICDAR-2013) text detection benchmarks. Our method achieves an F-measure of 88.65 on ICDAR-2013, 86.32 on ICDAR-2015, and 80.75 on ICDAR-2017 MLT. |
Database: | OpenAIRE |
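
The per-pixel, anchor-free prediction mentioned in the description can be illustrated with the generic FCOS formulation: each pixel inside a ground-truth box regresses its distances to the four box sides, and a centerness score down-weights pixels far from the box center. The sketch below is a minimal illustration for axis-aligned boxes only. The function names `fcos_targets` and `centerness` are hypothetical, and the record does not say how the authors extend this to multioriented text, so this should not be read as their implementation.

```python
import math


def fcos_targets(x, y, box):
    """Distances (l, t, r, b) from pixel (x, y) to the sides of an
    axis-aligned box (x0, y0, x1, y1), or None if the pixel is not
    strictly inside the box (assumption: boundary pixels are negatives)."""
    x0, y0, x1, y1 = box
    l, t, r, b = x - x0, y - y0, x1 - x, y1 - y
    if min(l, t, r, b) <= 0:
        return None
    return l, t, r, b


def centerness(l, t, r, b):
    """FCOS centerness: 1.0 at the box center, approaching 0 near an edge."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


if __name__ == "__main__":
    box = (0, 0, 10, 10)
    for px, py in [(5, 5), (2, 5), (20, 5)]:
        targets = fcos_targets(px, py, box)
        score = centerness(*targets) if targets else None
        print((px, py), targets, score)
```

Because every pixel predicts its own box directly, no anchor shapes or aspect ratios need to be chosen in advance, which is the design simplification the description attributes to the anchor-free approach.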