Script-agnostic reflow of text in document images
Authors: | Abhinav Uppal, Saurabh Panjwani, Edward Cutrell |
---|---|
Year of publication: | 2011 |
Subject: | Hindi, Multimedia, Computer science, Arabic, Usability, Kannada, Scripting language, Reading (process), ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Segmentation, Artificial intelligence, Mobile device, Natural language processing |
Source: | Mobile HCI |
DOI: | 10.1145/2037373.2037419 |
Description: | Reading text from document images can be difficult on mobile devices because of their limited screen width. While solutions exist for reflowing Latin-script text on such devices, they do not work well for images of other scripts or combinations of scripts, since they rely on script-specific characteristics or OCR. We present a technique that reflows text in document images in a manner that is agnostic to the script used to compose them. Our technique achieved over 95% segmentation accuracy on a corpus of 139 images containing text in four genetically distant languages: English, Hindi, Kannada, and Arabic. A preliminary user study with a prototype implementation of the technique provided evidence of some of its usability benefits. |
Database: | OpenAIRE |