Identification of Matra Region and Overlapping Characters for OCR of Printed Bengali Scripts

Author: Subhra Sundar Goswami
Publication year: 2011
Subject:
Source: Communications in Computer and Information Science ISBN: 9783642181337
DOI: 10.1007/978-3-642-18134-4_96
Description: One of the main reasons for a poor recognition rate in optical character recognition (OCR) systems is error in character segmentation. For Bangla script, these errors arise from several causes, including incorrect detection of the matra (headline), over-segmentation, and under-segmentation. We propose a robust method for detecting the headline region. Overlapping characters (in under-segmented parts) in scanned printed documents are a major obstacle to designing an effective character segmentation procedure for OCR systems. In this paper, a predictive algorithm is developed for identifying overlapping characters and then selecting the cut-borders for segmentation. Our method can be used to achieve a high recognition rate.
Database: OpenAIRE