Word-Based Adaptive OCR for Historical Books

Authors: Eugene Walach, Vladimir Kluzner, Yuval Shimony, Asaf Tzadok, Apostolos Antonacopoulos
Year of publication: 2009
Subject:
Source: ICDAR
DOI: 10.1109/icdar.2009.133
Description: This work proposes a new approach to recognizing historical texts: an adaptive mechanism that automatically tunes itself to a specific book. The system clusters all similar words in a book or text and handles each resulting class as a whole. The paper describes the architecture of such a system and new algorithms developed for robust word-image comparison (including registration, optical-flow-based distortion compensation, and adaptive binarization). Results for a large dataset are also presented, demonstrating a recognition improvement of over 23%.
Database: OpenAIRE