Identification of Matra Region and Overlapping Characters for OCR of Printed Bengali Scripts
Author: | Subhra Sundar Goswami |
---|---|
Year of publication: | 2011 |
Subject: |
Character (computing)
Computer science; Image processing and computer vision; Headline; Pattern recognition; Optical character recognition; Identification (information); Bengali; Scripting language; Segmentation; Artificial intelligence |
Source: | Communications in Computer and Information Science ISBN: 9783642181337 |
DOI: | 10.1007/978-3-642-18134-4_96 |
Description: | One important reason for poor recognition rates in optical character recognition (OCR) systems is error in character segmentation. For Bangla script, these errors have several causes, including incorrect detection of the matra (headline), over-segmentation, and under-segmentation. We propose a robust method for detecting the headline region. Overlapping characters (in under-segmented parts) in scanned printed documents are a major obstacle to designing an effective character segmentation procedure for OCR systems. In this paper, a predictive algorithm is developed for identifying overlapping characters and then selecting the cut-borders for segmentation. The method can be used to achieve high recognition rates. |
Database: | OpenAIRE |
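
The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of a common baseline for the two steps it names: locating the matra (headline) with a horizontal projection profile, and finding candidate cut columns below it with a vertical projection. It is not the paper's predictive method. The function names `find_headline` and `cut_columns`, the `ratio` parameter, and the toy image are all assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary text-line image: 1 = ink, 0 = background


def find_headline(line: Image, ratio: float = 0.8) -> Tuple[int, int]:
    """Return (top, bottom) row indices, inclusive, of the densest row band.

    Baseline assumption: the matra is the band of rows around the peak of
    the horizontal projection profile (ink pixels counted per row).
    """
    profile = [sum(row) for row in line]
    peak = max(range(len(profile)), key=profile.__getitem__)
    threshold = ratio * profile[peak]
    top = peak
    while top > 0 and profile[top - 1] >= threshold:
        top -= 1
    bottom = peak
    while bottom < len(profile) - 1 and profile[bottom + 1] >= threshold:
        bottom += 1
    return top, bottom


def cut_columns(line: Image, headline: Tuple[int, int]) -> List[int]:
    """Return columns that contain no ink below the headline band.

    These are candidate cut positions. Overlapping (under-segmented)
    characters are exactly the case where no such column separates them,
    which is why this baseline alone is not sufficient.
    """
    _, bottom = headline
    body = line[bottom + 1:]
    width = len(line[0])
    return [c for c in range(width) if all(row[c] == 0 for row in body)]


# Toy example (hypothetical data): two strokes joined by a two-row headline.
rows = [
    "000000000",
    "111111111",
    "111111111",
    "010000100",
    "011000110",
    "010000100",
]
word = [[int(ch) for ch in r] for r in rows]
headline = find_headline(word)
cuts = cut_columns(word, headline)
```

In this toy image the headline band is rows 1 to 2, and the empty columns between the two strokes become candidate cuts. Real scans would need binarization and skew correction first, and touching characters would need a more careful cut-selection step such as the one the paper proposes.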