Dominant Color Segmentation of Administrative Document Images by Hierarchical Clustering
Author: | Jean-Christophe Burie, Jean-Marc Ogier, Elodie Carel, Vincent Courboulay |
---|---|
Contributors: | Courboulay, Vincent, Laboratoire Informatique, Image et Interaction - EA 2118 (L3I), Université de La Rochelle (ULR) |
Language: | English |
Year of publication: | 2013 |
Subject: | Computer science; Process (engineering); 020206 networking & telecommunications; Context (language use); 02 engineering and technology; Hierarchical clustering; Workflow; [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV]; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering, electronic engineering, information engineering; A priori and a posteriori; 020201 artificial intelligence & image processing; Segmentation; Relevance (information retrieval); Data mining; Cluster analysis; ComputingMilieux_MISCELLANEOUS |
Source: | 13th ACM Symposium on Document Engineering (DocEng), Sep 2013, Florence, Italy |
Description: | This paper addresses the problem of segmenting color document images in an industrial context. Automated Document Recognition (ADR) systems greatly reduce companies' time and resource costs by managing their large volumes of administrative documents and by optimizing their workflow. Most of the time, the documents are binarized, because the industrial process has historically worked that way. Colorimetric information, however, can improve the process. In this paper, we propose an approach based on hierarchical clustering to extract dominant color masks from documents. Our dataset comprises different kinds of scanned administrative document images, such as invoices, forms, and letters, and we do not know a priori the number of dominant colors in these documents. The masks will then be fed to an OCR engine to provide additional information about the colorimetric context. This approach requires neither user interaction nor parameter-setting steps. Experiments on several types of documents show the relevance of the proposed approach. |
Database: | OpenAIRE |
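
The idea summarized in the description (hierarchical clustering of colors, with the number of dominant colors not known in advance, yielding one mask per dominant color) can be illustrated with a small sketch. This is not the authors' algorithm, which the record does not detail. It assumes centroid-linkage agglomerative clustering in RGB space, stopped by a distance threshold; the names `cluster_colors`, `dominant_color_masks`, and `max_merge_dist` are hypothetical, and the explicit threshold is itself an assumption, since the paper reports needing no parameter setting.

```python
from collections import Counter
from itertools import combinations
import math


def cluster_colors(pixels, max_merge_dist=60.0):
    """Agglomerative clustering of RGB colors (centroid linkage, assumed).

    Merging stops once the two closest clusters are farther apart than
    max_merge_dist, so the number of clusters is not fixed in advance.
    """
    counts = Counter(pixels)
    # Each cluster is (centroid, pixel count, list of member colors).
    clusters = [(tuple(map(float, c)), n, [c]) for c, n in counts.items()]
    while len(clusters) > 1:
        # Find the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: math.dist(clusters[p[0]][0], clusters[p[1]][0]),
        )
        (ci, wi, mi), (cj, wj, mj) = clusters[i], clusters[j]
        if math.dist(ci, cj) > max_merge_dist:
            break
        w = wi + wj
        # New centroid is the pixel-count-weighted mean of the two centroids.
        centroid = tuple((a * wi + b * wj) / w for a, b in zip(ci, cj))
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [(centroid, w, mi + mj)]
    return clusters


def dominant_color_masks(pixels, max_merge_dist=60.0):
    """Return one boolean mask (list of bool) per dominant color cluster."""
    clusters = cluster_colors(pixels, max_merge_dist)
    label = {
        color: idx
        for idx, (_, _, members) in enumerate(clusters)
        for color in members
    }
    return [[label[p] == idx for p in pixels] for idx in range(len(clusters))]


# Toy "image": a flat list of RGB pixels (paper white, black ink, a red stamp).
toy_pixels = (
    [(255, 255, 255)] * 6 + [(250, 250, 250)] * 2
    + [(0, 0, 0)] * 3 + [(10, 5, 0)]
    + [(200, 0, 0)] * 2
)
toy_masks = dominant_color_masks(toy_pixels)
```

On a real scan the pairwise search would be far too slow on raw pixels; quantizing colors into a coarse histogram first is one common way to keep the number of initial clusters manageable.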