Preprocessing of Document Images for Automatic Character Recognition

Author: Yi-Kai Chen, 陳怡凱
Year of publication: 2000
Document type: 學位論文 ; thesis
Description: 89
Automatic processing of document images is an important field of computer science. Many algorithms have been proposed for character and word recognition, and good preprocessing makes recognition more efficient and robust. Preprocessing has several subparts, including noise reduction, skew detection and reconstruction, and segmentation. In this dissertation, we present new and effective algorithms for preprocessing document images. First, we describe a robust skew detection and reconstruction algorithm that efficiently and effectively detects and corrects the skew of both monochrome and color-printed document images. Then, we propose a new algorithm that uses foreground and background analysis to separate single- or multiple-touching handwritten numeral strings. Finally, we propose a robust preprocessing approach for locating the destination address block on complex advertising mail.
In the first part of this dissertation, we describe a new algorithm that detects the skew angle of a monochrome or color-printed document image and reconstructs the image. Our approach first computes the variation of the monochrome- or color-transition count at each angle (from -45° to +45°) for monochrome or color-printed document images, respectively; the angle of maximal variation is taken as the skew angle. Then, a scanning-line model is used to reconstruct the image. We tested 103 monochrome document images and 100 color-printed document images of various kinds and obtained good results (5 failures among the monochrome images and 7 among the color-printed images). The average success rate is 94.1%. On a Pentium III 733 PC, the average processing time for an A4-size monochrome document image is 0.67 seconds and the reconstruction time is 0.96 seconds; for an A4-size color-printed document image, the average processing time is 2.76 seconds and the reconstruction time is 3.97 seconds.
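The transition-count idea for monochrome images can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes that "variation" means the variance of per-scan-line transition counts, that the image is a list of rows of 0/1 pixels, and that candidate angles are tried exhaustively. The function names `transition_counts`, `variance`, and `estimate_skew` are hypothetical.

```python
import math


def transition_counts(image, angle_deg):
    """Count 0/1 transitions along every scan line tilted by angle_deg.

    Assumption: a positive angle means the line moves down the image
    (increasing row index) as x increases.
    """
    height, width = len(image), len(image[0])
    slope = math.tan(math.radians(angle_deg))
    # Extra start rows so that tilted lines still cover the image corners.
    shift = math.ceil(abs(slope) * (width - 1))
    counts = []
    for start_row in range(-shift, height + shift):
        previous = None
        samples = 0
        transitions = 0
        for x in range(width):
            y = start_row + round(x * slope)
            if not 0 <= y < height:
                previous = None
                continue
            value = image[y][x]
            samples += 1
            if previous is not None and value != previous:
                transitions += 1
            previous = value
        # Ignore scan lines that barely touch the image.
        if samples >= 2:
            counts.append(transitions)
    return counts


def variance(values):
    """Population variance; 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def estimate_skew(image, candidate_angles):
    """Return the candidate angle whose transition counts vary the most."""
    return max(candidate_angles,
               key=lambda a: variance(transition_counts(image, a)))
```

When the scan lines are parallel to the text lines, they alternate between many transitions (inside a text line) and almost none (between lines), so the variance peaks near the true skew angle. A call such as `estimate_skew(image, range(-45, 46))` would search the range quoted in the abstract in one-degree steps; the step size is an assumption.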
In the second part of this dissertation, we propose a new segmentation algorithm that separates single- or multiple-touching handwritten numeral strings. Thinning is first applied to both the foreground and background regions of the image of connected numeral strings, and feature points on the foreground and background skeletons are then extracted. Several possible segmentation paths are constructed and useless strokes are removed. Finally, parameters describing the geometric properties of each possible segmentation path are determined and analyzed with a mixture Gaussian probability function to select the best segmentation path or to reject it. Experimental results on NIST Special Database 19 (an update of NIST Special Database 3) and on additional images that we collected (4500 test images in total) show that our algorithm achieves a correct rate of 96%.
In the last part of this dissertation, we propose a reliable method for locating the destination address on complex advertising mail. Our approach starts with a new method named “rising-falling-region extraction and binarization”, followed by connected component searching, filtering, and analysis, and a modified region-growing approach. Then, weight functions derived from our observations of complex advertising mail are used to identify which region is the destination address. The algorithm is demonstrated on 40 test images, with a 92.5% success rate in identifying the destination address. The results also show that our approach is invariant to skew and handles both handwritten and machine-printed addresses. The average processing time is 1.83 seconds per case on a Pentium III 733 PC.
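The connected component searching and filtering steps mentioned for address block location can be sketched as follows. This is a generic illustration rather than the method of the thesis: 8-connectivity, the breadth-first search, the dictionary layout, and the area thresholds in `filter_components` are all assumptions, and the function names are hypothetical.

```python
from collections import deque


def connected_components(image):
    """Return the bounding box and area of each 8-connected foreground region.

    `image` is a list of rows of 0/1 pixels, where 1 is foreground.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first search from an unvisited foreground pixel.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            top, bottom, left, right, area = y0, y0, x0, x0, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append({"top": top, "left": left, "bottom": bottom,
                               "right": right, "area": area})
    return components


def filter_components(components, min_area, max_area):
    """Keep components whose pixel area lies in [min_area, max_area]."""
    return [c for c in components if min_area <= c["area"] <= max_area]
```

In a pipeline like the one described, the surviving components would be candidates for region growing and for scoring by the weight functions; those later stages are not specified in the abstract and are therefore not sketched here.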
Database: Networked Digital Library of Theses & Dissertations