Text graphic separation in Indian newspapers

Authors: Anukriti Bansal, Santanu Chaudhury, Ritu Garg, Sumantra Dutta Roy
Year of publication: 2013
Subject:
Source: MOCR@ICDAR
DOI: 10.1145/2505377.2505393
Description: Digitization of newspaper articles is important for recording historical events. Layout analysis of Indian newspapers is a challenging task due to the presence of different font sizes and font styles and the random placement of text and non-text regions. In this paper we propose a novel framework for learning optimal parameters for text-graphic separation in the presence of complex layouts. The learning problem is formulated as an optimization problem, using the EM algorithm to learn optimal parameters that depend on the nature of the document content.
Database: OpenAIRE