Character recognition in the presence of occluding clutter

Authors:	Knut T. Fosseide, Lars Aurdal
Year of publication:	2009
Subject:	Computer science; Clutter; Computer vision; Artificial intelligence; Optical character recognition; Character recognition
Source:	DRR
ISSN:	0277-786X
DOI:	10.1117/12.805855
Description:	Many documents contain (free-hand) underlining, "COPY" stamps, crossed-out text, doodling and other "clutter" that occlude the text. In many cases, it is not possible to separate the text from the clutter, and commercial OCR solutions typically fail on cluttered text. We present a new method for finding the clutter using path analysis of points on the skeleton of the clutter/text connected component. This method can separate the clutter from the text even for fairly complex clutter shapes. Even with good localization of the occluding clutter, feature-based recognition of occluded characters remains difficult, simply because the clutter affects the features in various ways. We propose a new algorithm that uses adapted templates of the font in the document and that can be used for all forms of character occlusion. The method simulates the location of the corresponding clutter in each template and compares only the parts of the template and the character that the clutter leaves unaffected. The method has proved highly successful even when much of the character is occluded. We present examples of clutter localization and of character recognition with occluded characters.
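Illustration:	The template comparison described in the abstract can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation. It assumes binary glyph images of equal size and a clutter mask that has already been localized. The names `masked_score`, `classify`, `TEMPLATES`, `observed` and `clutter` are hypothetical, and the simple pixel-agreement score stands in for whatever similarity measure the paper actually uses.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]  # binary image: 1 = ink, 0 = background


def masked_score(char: Grid, template: Grid, clutter: Grid) -> float:
    """Fraction of pixels outside the clutter mask where char and template agree."""
    agree = 0
    total = 0
    for row_c, row_t, row_m in zip(char, template, clutter):
        for c, t, m in zip(row_c, row_t, row_m):
            if m:  # pixel is covered by clutter, so it carries no evidence
                continue
            total += 1
            agree += int(c == t)
    return agree / total if total else 0.0


def classify(char: Grid, templates: Dict[str, Grid], clutter: Grid) -> Tuple[str, float]:
    """Return the label of the best-matching template and its masked score."""
    scores = {label: masked_score(char, tpl, clutter) for label, tpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy 3x3 font templates (assumed data, for illustration only).
TEMPLATES: Dict[str, Grid] = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# An "L" crossed by a horizontal clutter stroke through the middle row.
observed: Grid = [[1, 0, 0], [1, 1, 1], [1, 1, 1]]
clutter: Grid = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

label, score = classify(observed, TEMPLATES, clutter)
print(label, score)
```

Because the middle row is excluded from the comparison, the crossed-out glyph still matches the "L" template on every remaining pixel, whereas an unmasked comparison would be penalized by the clutter stroke.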
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::cf06043f4a2a830ed68419d6d8040e19 https://doi.org/10.1117/12.805855