Text recognition in document images obtained by a smartphone based on deep convolutional and recurrent neural network
Authors: | Hassan El Bahi, Abdelkarim Zatni |
---|---|
Year of publication: | 2019 |
Subject: |
Computer Networks and Communications
Computer science; business.industry; Frame (networking); 020207 software engineering; Pattern recognition; 02 engineering and technology; Convolutional neural network; Recurrent neural network; Connectionism; Hardware and Architecture; Sliding window protocol; 0202 electrical engineering, electronic engineering, information engineering; Media Technology; Artificial intelligence; Line (text file); business; Software; Block (data storage) |
Source: | Multimedia Tools and Applications. 78:26453-26481 |
ISSN: | 1573-7721 1380-7501 |
DOI: | 10.1007/s11042-019-07855-z |
Description: | Automatic text recognition in document images is an important task in many real-world applications. Several systems have been proposed to accomplish this task. However, little attention has been given to document images obtained by mobile phones. To meet this need, we propose a new system that integrates preprocessing, feature extraction and classification in order to recognize text contained in document images acquired by a smartphone. The preprocessing phase locates the text region and then segments that region into text-line images. In the second phase, a sliding window divides the text-line image into a sequence of frames; a deep convolutional neural network (CNN) model is then used to extract features from each frame. Finally, an architecture that combines the bidirectional recurrent neural network (RNN), the gated recurrent unit (GRU) block and the connectionist temporal classification (CTC) layer is explored for the classification phase. The proposed system has been tested on the ICDAR2015 Smartphone document OCR dataset, and the experimental results show that it is capable of achieving promising recognition rates. |
Database: | OpenAIRE |
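
The description mentions two steps that are easy to illustrate without any deep learning library: cutting a text-line image into a sequence of frames with a sliding window, and collapsing per-frame labels in the way a CTC layer's output is usually read (merge repeats, then drop blanks). The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the window `width`, the `stride`, the blank index `0`, and the use of greedy decoding are all hypothetical choices, since the record does not give these details.

```python
from typing import List, Sequence

# Assumption: an image is a list of rows, each row a list of pixel values.
Image = List[List[int]]


def sliding_window_frames(line_image: Image, width: int, stride: int) -> List[Image]:
    """Cut a text-line image into overlapping vertical frames, left to right."""
    if width <= 0 or stride <= 0:
        raise ValueError("width and stride must be positive")
    total_width = len(line_image[0]) if line_image else 0
    frames = []
    # Each frame keeps the full height and `width` consecutive columns.
    # Trailing columns that do not fill a whole window are dropped.
    for start in range(0, total_width - width + 1, stride):
        frames.append([row[start:start + width] for row in line_image])
    return frames


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

In a full system of the kind described, each frame would be passed through the CNN, the resulting feature sequence through the bidirectional GRU network, and the per-frame label scores would then be decoded; here `frame_labels` stands in for the most likely label at each frame.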