Improved localization accuracy by LocNet for Faster R-CNN based text detection in natural scene images
Author: | Zhuoyao Zhong, Qiang Huo, Lei Sun |
---|---|
Year of publication: | 2019 |
Subject: | business.industry; Computer science; Pattern recognition; 02 engineering and technology; Text detection; 01 natural sciences; Column (database); Artificial Intelligence; Minimum bounding box; Simple (abstract algebra); 0103 physical sciences; Signal Processing; 0202 electrical engineering, electronic engineering, information engineering; Benchmark (computing); 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Artificial intelligence; 010306 general physics; business; Software |
Source: | Pattern Recognition. 96:106986 |
ISSN: | 0031-3203 |
Description: | Although Faster R-CNN based text detection approaches have achieved promising results, their localization accuracy is unsatisfactory in certain cases because their localization modules, which rely on bounding box regression, are sub-optimal. In this paper, we address this problem by replacing the bounding box regression module with a novel LocNet based localization module to improve the localization accuracy of a Faster R-CNN based text detector. Given a proposal generated by a region proposal network (RPN), instead of directly predicting the bounding box coordinates of the text instance concerned, the proposal is enlarged to create a search region, and an “In-Out” conditional probability is assigned to each row and column of this search region; these probabilities can then be used to accurately infer the bounding box. Furthermore, we present a simple yet effective two-stage approach that converts the difficult multi-oriented text detection problem into an easier horizontal text detection problem, which enables our approach to robustly detect multi-oriented text instances with accurate bounding box localization. Experiments demonstrate that the proposed approach significantly boosts the localization accuracy of Faster R-CNN based text detectors. Consequently, our new text detector achieves superior performance on both horizontal (ICDAR-2011, ICDAR-2013 and MULTILINGUAL) and multi-oriented (MSRA-TD500, ICDAR-2015) text detection benchmark tasks. |
Database: | OpenAIRE |
External link: |
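
The description above mentions enlarging a proposal into a search region and using per-row and per-column “In-Out” probabilities to infer the bounding box. The following is only a minimal sketch of that idea, not the authors' implementation: the function names `enlarge_box` and `decode_in_out`, the enlargement factor, and the simple 0.5 thresholding rule are all assumptions made for illustration, and the paper's actual inference procedure may differ.

```python
import numpy as np


def enlarge_box(box, factor=1.5):
    """Scale a box (x0, y0, x1, y1) about its centre.

    `factor` is a hypothetical value, not taken from the paper.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w = (x1 - x0) * factor / 2.0
    half_h = (y1 - y0) * factor / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def _span(probs, lo, hi, thresh):
    """Map the first and last bins with prob >= thresh back to coordinates."""
    probs = np.asarray(probs, dtype=float)
    inside = np.flatnonzero(probs >= thresh)
    if inside.size == 0:
        return None  # no row/column is considered inside the object
    step = (hi - lo) / probs.size  # width of one bin in image units
    return float(lo + inside[0] * step), float(lo + (inside[-1] + 1) * step)


def decode_in_out(col_probs, row_probs, region, thresh=0.5):
    """Infer a box from In-Out probabilities by thresholding (assumed rule).

    col_probs[i] / row_probs[j] are the probabilities that column i / row j
    of the search region lies inside the text instance.
    """
    x0, y0, x1, y1 = region
    xs = _span(col_probs, x0, x1, thresh)
    ys = _span(row_probs, y0, y1, thresh)
    if xs is None or ys is None:
        return None
    return (xs[0], ys[0], xs[1], ys[1])
```

In this sketch the search region is divided into as many bins as there are probabilities along each axis, so the resolution of the recovered box depends on the length of `col_probs` and `row_probs`.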