Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks
Authors: | Mélodie Boillet, Christopher Kermorvant, Thierry Paquet |
---|---|
Year of publication: | 2021 |
Subject: |
FOS: Computer and information sciences
Artificial neural network; Computer science; business.industry; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Training (meteorology); 020207 software engineering; Pattern recognition; 02 engineering and technology; Image segmentation; Task (project management); Line (geometry); 0202 electrical engineering, electronic engineering, information engineering; Deep neural networks; 020201 artificial intelligence & image processing; Segmentation; Artificial intelligence; business; Document layout analysis |
Source: | ICPR |
DOI: | 10.1109/icpr48806.2021.9412447 |
Description: | In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods use models pre-trained on natural scene images, our method, Doc-UFCN, relies on a U-shaped model trained from scratch to detect objects in historical documents. We treat line segmentation, and more generally layout analysis, as a pixel-wise classification task, so our model outputs a pixel-level labeling of the input images. We show that Doc-UFCN outperforms state-of-the-art methods on various datasets and demonstrate that components pre-trained on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve performance. We evaluate the models using various metrics to provide a fair and complete comparison between the methods. |
Database: | OpenAIRE |
External link: |
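
The description frames line detection as pixel-wise classification, with the model emitting a label for every pixel. The following is a minimal sketch of that output step only, not the Doc-UFCN network itself: it assumes the per-class probability maps are already available as nested lists, and the function names `label_pixels` and `pixel_iou` are hypothetical. Intersection-over-union is used here as one common pixel-level metric; the abstract does not say which metrics the authors report.

```python
def label_pixels(prob_maps):
    """Turn per-class probability maps into one label map.

    prob_maps[c][y][x] is the (assumed) probability that pixel (y, x)
    belongs to class c. Each pixel gets the class with the highest value.
    """
    n_classes = len(prob_maps)
    height = len(prob_maps[0])
    width = len(prob_maps[0][0])
    return [
        [max(range(n_classes), key=lambda c: prob_maps[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def pixel_iou(predicted, reference, cls):
    """Intersection-over-union of one class between two label maps."""
    inter = union = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            in_pred = p == cls
            in_ref = r == cls
            inter += in_pred and in_ref
            union += in_pred or in_ref
    # If the class is absent from both maps, treat them as in agreement.
    return inter / union if union else 1.0


# Toy 2x3 image with two classes: 0 = background, 1 = text line.
background = [[0.9, 0.2, 0.3], [0.8, 0.7, 0.6]]
text_line = [[0.1, 0.8, 0.7], [0.2, 0.3, 0.4]]
labels = label_pixels([background, text_line])

reference = [[0, 1, 1], [0, 0, 1]]  # made-up ground truth for illustration
score = pixel_iou(labels, reference, cls=1)
```

In this toy case the predicted text-line region covers two of the three reference pixels, so the score for class 1 is 2/3. A real pipeline would typically go on to group labeled pixels into line polygons, which this sketch leaves out.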