Robust table recognition for printed document images

Authors: Jian Zhong Peng, Qiao Kang Liang, Yao Nan Wang, Dan Zhang, Zhengwei Li, Da Qi Xie, Wei Sun
Year of publication: 2020
Subject:
Computer science
character recognition
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
Image processing
02 engineering and technology
binarization algorithm
Robustness (computer science)
0502 economics and business
0202 electrical engineering
electronic engineering
information engineering
Recognition system
table image recognition
QA1-939
Preprocessor
business.industry
Applied Mathematics
Deep learning
05 social sciences
Pattern recognition
General Medicine
Computational Mathematics
Recurrent neural network
Modeling and Simulation
020201 artificial intelligence & image processing
Artificial intelligence
Mutual exclusion
Chinese characters
General Agricultural and Biological Sciences
business
050203 business & management
TP248.13-248.65
Mathematics
Biotechnology
Source: Mathematical Biosciences and Engineering, Vol 17, Iss 4, Pp 3203-3223 (2020)
ISSN: 1551-0018
Description: The recognition and analysis of tables in printed document images is an active research field in pattern recognition and image processing. Existing table recognition methods usually require a high degree of regularity in the table, and their robustness still needs significant improvement. This paper focuses on a robust table recognition system that consists of three main parts: image preprocessing, cell location based on contour mutual exclusion, and recognition of printed Chinese characters based on a deep learning network. A table recognition app has been developed from the proposed algorithms; it transforms captured images into editable text in real time. The effectiveness of the app has been verified on a test dataset of 105 images. The test results show that it identifies high-quality tables well, and that its recognition rate for low-quality tables with distortion and blur reaches 81%, which is considerably higher than that of existing methods. The work in this paper could give insights into the application of table recognition and analysis algorithms.
Database: OpenAIRE