Application of daisy descriptor for language identification in the wild

Authors: Ayatullah Faruk Mollah, Agneet Chatterjee, Pawan Kumar Singh, Neelotpal Chakraborty, Ram Sarkar
Publication year: 2020
Subject:
Source: Multimedia Tools and Applications. 80:323-344
ISSN: 1573-7721, 1380-7501
DOI: 10.1007/s11042-020-09728-2
Description: Recent years have witnessed significant development in the field of text detection in natural scene images. However, issues such as poor image quality and complex backgrounds reduce the efficiency of such methods, thereby requiring a good pre-processing module for image enhancement. Also, conventional texture-based features have some limitations in classifying text and non-text components because of potential similarities between them. To this end, a new model is proposed in which the image quality is first enhanced by removing noise and blur. Then, a histogram-based adaptive K-means clustering of intensity values is performed in order to extract the text candidates. These candidates are then analyzed using the Daisy descriptor for text/non-text determination and for language identification of the text. The proposed model is applied to an in-house multi-lingual dataset of images with text in Indian languages, and to standard datasets including ICDAR 2017, MLe2e and KAIST. The results indicate significant improvement in performance compared to some contemporary methods.
Database: OpenAIRE
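
Note: The description mentions a histogram-based K-means clustering of intensity values for extracting text candidates. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a fixed number of clusters k (the record does not say how the adaptive choice is made) and an evenly spaced initialisation; the function names intensity_histogram, histogram_kmeans and label_pixels are hypothetical.

```python
from typing import List, Sequence


def intensity_histogram(pixels: Sequence[int], levels: int = 256) -> List[int]:
    """Count how many pixels fall on each gray level in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


def histogram_kmeans(hist: Sequence[int], k: int, iters: int = 50) -> List[float]:
    """1-D k-means over gray levels, each level weighted by its pixel count.

    Clustering the histogram instead of the raw pixels keeps the cost of an
    iteration proportional to the number of gray levels, not the image size.
    """
    occupied = [g for g, n in enumerate(hist) if n > 0]
    if not occupied:
        raise ValueError("empty histogram")
    lo, hi = occupied[0], occupied[-1]
    # Assumption: centres start evenly spread over the occupied intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for g in occupied:
            # Assign this gray level to its nearest centre.
            j = min(range(k), key=lambda c: abs(g - centres[c]))
            sums[j] += g * hist[g]
            counts[j] += hist[g]
        # Weighted mean per cluster; an empty cluster keeps its old centre.
        updated = [sums[j] / counts[j] if counts[j] else centres[j] for j in range(k)]
        if updated == centres:
            break
        centres = updated
    return centres


def label_pixels(pixels: Sequence[int], centres: Sequence[float]) -> List[int]:
    """Map every pixel to the index of its nearest cluster centre."""
    return [min(range(len(centres)), key=lambda c: abs(p - centres[c])) for p in pixels]
```

In a pipeline like the one described, the pixels of one resulting cluster would form candidate regions that are then passed to a descriptor-based classifier; that step is not shown here.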