PHTI-WS: A Printed and Handwritten Text Identification Web Service Based on FCN and CRF Post-Processing
Author: | Nicolas Dutly, Rolf Ingold, Fouad Slimane |
---|---|
Year of publication: | 2019 |
Subject: |
Conditional random field; Measure (data warehouse); Computer science; business.industry; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 020207 software engineering; 02 engineering and technology; Image segmentation; 010501 environmental sciences; computer.software_genre; 01 natural sciences; Task (project management); Identification (information); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering, electronic engineering, information engineering; Artificial intelligence; Web service; business; computer; Natural language processing; 0105 earth and related environmental sciences |
Source: | OST@ICDAR |
DOI: | 10.1109/icdarw.2019.10033 |
Description: | This paper introduces a lightweight model for printed and handwritten text identification in document images, which is then deployed as a web service to enable easy integration into existing workflows. Identifying printed and handwritten text in documents containing multiple handwritten annotations that partially overlap machine-printed text is a challenging task. The lack of a dataset with pixel-level annotations for this task explains why many papers tackling this problem employ word- or line-level classification, which cannot effectively solve the task mentioned above. In this work, we use a newly created dataset containing pixel-level annotations to train a lightweight, fully convolutional network, which we combine with a conditional random field for postprocessing. We measure the performance of our method on the aforementioned dataset and compare it to results achieved by the U-net fully convolutional architecture. Initial test results indicate a 90% mean IoU, which is a 5% improvement compared to the results produced by the U-net model after postprocessing. Hence, our contribution is two-fold. First, we introduce a model which combines a lightweight fully convolutional architecture with conditional random field postprocessing to solve the task of printed and handwritten text recognition at the pixel level. Second, we describe how our model is deployed as a web service. |
Database: | OpenAIRE |
External link: |
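
The description reports results as mean IoU (intersection over union) on pixel-level annotations. As a rough illustration of how such a score is typically computed, here is a minimal sketch; it is not the authors' evaluation code. The function name `mean_iou`, the three-class labelling (0 = background, 1 = printed, 2 = handwritten), and the tiny label maps are all assumptions made for this example.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over two label maps given as nested lists.

    Classes that appear in neither map are skipped, which is one common
    convention; other evaluations may handle absent classes differently.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts toward intersection and union of class p.
                inter[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Hypothetical 2x3 label maps (assumed: 0 = background, 1 = printed, 2 = handwritten).
target = [[0, 1, 1],
          [2, 2, 0]]
pred = [[0, 1, 2],
        [2, 2, 0]]

score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU: {score:.4f}")
```

In this toy case one printed pixel is mislabelled as handwritten, which lowers the IoU of both classes while leaving the background score untouched.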