Robust table recognition for printed document images
Author: | Jian Zhong Peng, Qiao Kang Liang, Yao Nan Wang, Dan Zhang, Zhengwei Li, Da Qi Xie, Wei Sun |
---|---|
Year of publication: | 2020 |
Subject: |
Computer science
character recognition; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image processing; 02 engineering and technology; binarization algorithm; Robustness (computer science); 0502 economics and business; 0202 electrical engineering, electronic engineering, information engineering; Recognition system; table image recognition; QA1-939; Preprocessor; business.industry; Applied Mathematics; Deep learning; 05 social sciences; deep learning; Pattern recognition; General Medicine; Computational Mathematics; Recurrent neural network; Modeling and Simulation; 020201 artificial intelligence & image processing; recurrent neural network; Artificial intelligence; Mutual exclusion; Chinese characters; General Agricultural and Biological Sciences; business; 050203 business & management; TP248.13-248.65; Mathematics; Biotechnology |
Source: | Mathematical Biosciences and Engineering, Vol 17, Iss 4, Pp 3203-3223 (2020) |
ISSN: | 1551-0018 |
Description: | The recognition and analysis of tables in printed document images is an active research area in pattern recognition and image processing. Existing table recognition methods usually require a high degree of regularity in the table, and their robustness still needs significant improvement. This paper focuses on a robust table recognition system that consists mainly of three parts: image preprocessing, cell location based on contour mutual exclusion, and recognition of printed Chinese characters based on a deep learning network. A table recognition app has been developed based on the proposed algorithms; it can transform captured images into editable text in real time. The effectiveness of the app has been verified by testing on a dataset of 105 images. The test results show that it identifies high-quality tables well, and that the recognition rate for low-quality tables with distortion and blur reaches 81%, which is considerably higher than that of existing methods. The work in this paper could give insights into the application of table recognition and analysis algorithms. |
Database: | OpenAIRE |
External link: |