Learning text-line localization with shared and local regression neural networks

Authors: Christian Wolf, Christopher Kermorvant, Bastien Moysset, Jérôme Louradour
Contributors: Extraction de Caractéristiques et Identification (imagine), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2), A2iA (A2iA), A2iA, Wolf, Christian
Language: English
Publication year: 2016
Subject:
Source: International Conference on Frontiers in Handwriting Recognition (ICFHR), Oct 2016, Shenzhen, China
Description: Text line detection and localisation is a crucial step for full-page document analysis, but it still suffers from the heterogeneity of real-life documents. In this paper, we present a novel approach for text line localisation that uses Convolutional Neural Networks and Multidimensional Long Short-Term Memory cells as a regressor to predict the coordinates of the text line bounding boxes directly from the pixel values. Targeting the typically large images found in document image analysis, we propose a new model using weight sharing over local blocks. We compare two strategies: directly predicting the four coordinates, or predicting lower-left and upper-right points separately followed by matching. We evaluate our work on the highly unconstrained Maurdor dataset and show that our method outperforms both other machine learning methods and image processing methods.
Database: OpenAIRE