Minimizing User Annotations in the Generation of Layout Ground-Truthed Data
Authors: | Karim Hadjar, Rolf Ingold |
---|---|
Year of publication: | 2011 |
Subject: |
Ground truth, Artificial neural network, Contextual image classification, Arabic, Computer science, Knowledge engineering, Text mining, Data mining, Adaptation (computer science), XML, Computer Science::Information Retrieval, GeneralLiterature_MISCELLANEOUS, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING |
Source: | ICDAR |
DOI: | 10.1109/icdar.2011.147 |
Description: | This paper describes the adaptation of a previously developed document recognition framework, PLANET (Physical Layout Analysis of complex structured Arabic documents using artificial neural NETs), into a ground-truthing system for complex Arabic document images [8]. PLANET is a layout analysis tool for Arabic documents with complex structures that supports incremental learning in an interactive environment. Artificial neural networks drive the classification of homogeneous text blocks. We have observed that when users employ PLANET for ground truthing, the number of interactive corrections is quite large. To reduce user intervention and to use PLANET as a ground-truthing system, we have adapted its architecture. |
Database: | OpenAIRE |