SCATTER: Selective Context Attentional Scene Text Recognizer
Authors: | Shahar Tsiper, R. Manmatha, Oron Anschel, Roee Litman, Ron Litman, Shai Mazor |
---|---|
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Computer science; business.industry; Speech recognition; Computer Vision and Pattern Recognition (cs.CV); Feature extraction; Computer Science - Computer Vision and Pattern Recognition; Context (language use); 02 engineering and technology; 010501 environmental sciences; 01 natural sciences; Visualization; 0202 electrical engineering, electronic engineering, information engineering; Task analysis; 020201 artificial intelligence & image processing; Artificial intelligence; business; Encoder; 0105 earth and related environmental sciences; Block (data storage) |
Source: | CVPR |
DOI: | 10.48550/arxiv.2003.11288 |
Description: | Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes. In this paper, we introduce a novel architecture for STR, named Selective Context ATtentional Text Recognizer (SCATTER). SCATTER utilizes a stacked block architecture with intermediate supervision during training, that paves the way to successfully train a deep BiLSTM encoder, thus improving the encoding of contextual dependencies. Decoding is done using a two-step 1D attention mechanism. The first attention step re-weights visual features from a CNN backbone together with contextual features computed by a BiLSTM layer. The second attention step, similar to previous papers, treats the features as a sequence and attends to the intra-sequence relationships. Experiments show that the proposed approach surpasses SOTA performance on irregular text recognition benchmarks by 3.7% on average. Comment: In CVPR 2020 |
Database: | OpenAIRE |
External link: |