Segmentation of Text and Graphics/Images Using the Gray-Level Histogram Fourier Transform

Authors: Miguel A. Patricio, Darío Maravall Gómez-Allende
Publication year: 2000
Source: Advances in Pattern Recognition ISBN: 9783540679462
SSPR/SPR
DOI: 10.1007/3-540-44522-6_78
Description: A crucial issue in automatic document analysis is discriminating between text and graphics/images. This paper presents a novel, robust method for segmenting text and graphics/images in digitized documents. The method represents window-like portions of a document by their gray-level histograms. Empirical evidence shows that text regions and graphics/image regions have different gray-level histograms. Unlike the usual approach, which characterizes histograms by statistical parameters, the method works with the Fourier transform of the histogram, since it contains all the information in the histogram pattern. The next logical step is to automatically select the spectral components that best discriminate between text and graphics/images. A fully automated procedure for the optimal selection of these discriminant features is also presented. Finally, empirical results obtained for text and graphics/image segmentation with a simple three-layer perceptron-like neural network are discussed.
Database: OpenAIRE
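
The pipeline outlined in the description (window histogram, Fourier transform of the histogram, selection of discriminant spectral components) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not specify the selection criterion, so a per-component Fisher ratio is used here as a stand-in, and the function names, the 32x32 window size, the choice of k, and the synthetic "text-like" and "image-like" windows are all hypothetical. The perceptron classifier mentioned in the description is omitted.

```python
import numpy as np

def histogram_spectrum(window, n_bins=256):
    """Magnitude of the Fourier transform of a window's normalized gray-level histogram."""
    hist, _ = np.histogram(window, bins=n_bins, range=(0, n_bins))
    hist = hist.astype(float) / max(hist.sum(), 1)  # normalize so the bins sum to 1
    return np.abs(np.fft.rfft(hist))                # n_bins // 2 + 1 spectral components

def fisher_scores(feats_a, feats_b, eps=1e-12):
    """Per-component Fisher ratio (assumed criterion, not taken from the paper)."""
    num = (feats_a.mean(axis=0) - feats_b.mean(axis=0)) ** 2
    den = feats_a.var(axis=0) + feats_b.var(axis=0) + eps
    return num / den

def select_components(feats_a, feats_b, k=8):
    """Indices of the k spectral components with the highest Fisher ratio."""
    return np.argsort(fisher_scores(feats_a, feats_b))[::-1][:k]

# Hypothetical synthetic windows standing in for real document regions.
rng = np.random.default_rng(0)

def make_text_window(size=32):
    # Mostly bright background with a minority of dark "ink" pixels (bimodal histogram).
    base = np.where(rng.random((size, size)) < 0.15, 20, 235)
    noisy = base + rng.integers(-10, 11, size=(size, size))
    return np.clip(noisy, 0, 255).astype(np.uint8)

def make_image_window(size=32):
    # Broad spread of gray levels, as in a continuous-tone picture.
    values = np.rint(rng.normal(128, 50, size=(size, size)))
    return np.clip(values, 0, 255).astype(np.uint8)

text_feats = np.array([histogram_spectrum(make_text_window()) for _ in range(50)])
image_feats = np.array([histogram_spectrum(make_image_window()) for _ in range(50)])

selected = select_components(text_feats, image_feats, k=8)
print("selected spectral components:", selected)
```

Because each histogram is normalized to sum to 1, the zero-frequency component equals 1 for every window and carries no discriminative information, so a selection step of this kind should never rank it highly. The selected components would then be the inputs to a classifier such as the perceptron-like network the description mentions.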