Intelligent Forms Processing
Author: | R. G. Casey, D. R. Ferguson |
---|---|
Year of publication: | 1990 |
Subject: |
General Computer Science
Matching (graph theory), Computer science, Reading (computer), Process (computing), Data field, Optical character recognition, Computer Graphics and Computer-Aided Design, Blank, Theoretical Computer Science, Forms processing, Computational Theory and Mathematics, Component (UML), Computer vision, Artificial intelligence, Software, Information Systems |
Source: | IBM Systems Journal. 29:435-450 |
ISSN: | 0018-8670 |
DOI: | 10.1147/sj.293.0435 |
Description: | The automatic reading of optically scanned forms consists of two major components: extraction of the data image from the form and interpretation of the image as coded alphanumerics. The second component is also known as optical character recognition, or OCR. We have implemented a method for entry of a wide variety of forms that contain machine-printed data and that are often produced in business environments. The function, called Intelligent Forms Processing (IFP), accepts conventional forms that call for information to be printed in designated blank areas, but in which the information may exceed boundaries due to poor registration during printing. The human eye easily accommodates data that impinge on form boundaries or on background text; however, the same powers of discrimination applied to machine processing pose a technical challenge. The IFP system uses a setup phase to create a model of each form that is to be read. Scanned forms containing data are compared against the matching form model. Special algorithms are employed to extract data fields while removing background printing (e.g., form lines) intersecting the data. The extracted data images are interpreted by an OCR process that reads typical monospace fonts. New fonts may be added easily in a separate design mode. If the data are alphabetic, a lexicon may be assembled to define the possible entries. |
Database: | OpenAIRE |
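
The description mentions that a lexicon may be assembled to define the possible entries of an alphabetic field. The record does not say how the IFP system applies that lexicon, so the following is only a minimal sketch of one common approach: snap a noisy OCR string to the closest lexicon entry by edit distance. The names `edit_distance`, `match_lexicon`, and the `max_distance` threshold are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_lexicon(ocr_text: str,
                  lexicon: Sequence[str],
                  max_distance: int = 2) -> Optional[str]:
    """Return the lexicon entry closest to ocr_text, or None when no
    entry lies within max_distance edits (an assumed rejection rule)."""
    if not lexicon:
        return None
    query = ocr_text.upper()
    best = min(lexicon, key=lambda entry: edit_distance(query, entry.upper()))
    if edit_distance(query, best.upper()) <= max_distance:
        return best
    return None


# Hypothetical lexicon for a "state" field on a form.
STATES = ["OHIO", "IOWA", "UTAH", "TEXAS"]
corrected = match_lexicon("0HIO", STATES)    # digit 0 misread for letter O
rejected = match_lexicon("XXXXXXX", STATES)  # too far from every entry
```

Rejecting strings that are far from every entry, rather than always returning the nearest one, lets a field be flagged for manual review instead of being silently miscorrected.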