Translation-Inspired OCR
Authors: | Andrew W. Senior, Frank Yung-Fong Tang, Ashok C. Popat, Eugene Ie, Nemanja Spasojevic, Dmitriy Genzel, Michael Edward Jahr |
---|---|
Year of publication: | 2011 |
Subject: | Machine translation; Computer science; Speech recognition; Word error rate; Optical character recognition; Transfer-based machine translation; Example-based machine translation; Text mining; Rule-based machine translation; Cache language model; Artificial intelligence; Language model; Computational linguistics; Natural language processing |
Source: | ICDAR |
DOI: | 10.1109/icdar.2011.269 |
Description: | Optical character recognition is carried out using techniques borrowed from statistical machine translation. In particular, the use of multiple simple feature functions in linear combination, along with minimum-error-rate training, integrated decoding, and $N$-gram language modeling, is found to be remarkably effective across several scripts and languages. Results are presented using both synthetic and real data in five languages. |
Database: | OpenAIRE |
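
The description's central idea, scoring candidate transcriptions by a linear combination of simple feature functions that includes an $N$-gram language model, can be illustrated with a minimal sketch. This is not the authors' system: the feature names (`visual`, `lm`, `length`), the character-bigram table, the probability floor, and the hand-set weights are all assumptions chosen for illustration, and the weights would in practice be tuned by minimum-error-rate training rather than set manually.

```python
import math


def bigram_logprob(text, bigram_probs, floor=1e-6):
    """Character-bigram log-probability; unseen bigrams get a small floor (assumed)."""
    return sum(
        math.log(bigram_probs.get(text[i:i + 2], floor))
        for i in range(len(text) - 1)
    )


def linear_score(features, weights):
    """Weighted sum of feature-function values (the linear combination)."""
    return sum(weights[name] * value for name, value in features.items())


def decode(candidates, weights, bigram_probs):
    """Return the candidate text with the highest combined score."""
    def total(candidate):
        features = {
            "visual": candidate["visual_logprob"],  # hypothetical image-match score
            "lm": bigram_logprob(candidate["text"], bigram_probs),
            "length": len(candidate["text"]),
        }
        return linear_score(features, weights)

    return max(candidates, key=total)["text"]


# Toy data: the visually preferred reading "tbe" is linguistically implausible.
candidates = [
    {"text": "the", "visual_logprob": -2.0},
    {"text": "tbe", "visual_logprob": -1.8},
]
bigram_probs = {"th": 0.1, "he": 0.1, "be": 0.05}

with_lm = decode(candidates, {"visual": 1.0, "lm": 1.0, "length": 0.0}, bigram_probs)
without_lm = decode(candidates, {"visual": 1.0, "lm": 0.0, "length": 0.0}, bigram_probs)
```

With the language-model weight switched on, the combined score prefers "the" even though "tbe" has the better visual score; with that weight set to zero, the visual feature alone decides. Changing the weights, rather than the features, is what a tuning procedure such as minimum-error-rate training would adjust.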