Translation-Inspired OCR

Authors: Andrew W. Senior, Frank Yung-Fong Tang, Ashok C. Popat, Eugene Ie, Nemanja Spasojevic, Dmitriy Genzel, Michael Edward Jahr
Year of publication: 2011
Subject:
Source: ICDAR
DOI: 10.1109/icdar.2011.269
Description: Optical character recognition is carried out using techniques borrowed from statistical machine translation. In particular, the use of multiple simple feature functions in linear combination, along with minimum-error-rate training, integrated decoding, and $N$-gram language modeling, is found to be remarkably effective across several scripts and languages. Results are presented using both synthetic and real data in five languages.
Database: OpenAIRE