Improving Urdu Recognition Using Character-Based Artistic Features of Nastalique Calligraphy

Authors: Qurat Ul Ain Akram, Sarmad Hussain
Language: English
Publication year: 2019
Subject:
Source: IEEE Access, Vol 7, Pp 8495-8507 (2019)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2018.2887103
Description: State-of-the-art Urdu recognition approaches for Nastalique use features together with the sequence of character labels for classification and recognition. In Arabic-like cursive scripts, characters are joined together to form a ligature. Conventional methods process the connected stroke of a ligature as a sequence of characters. However, the connected stroke of a ligature image is a sequence of pairs of characters and their joiners, rather than a sequence of characters alone. Each character has a distinctive shape that clearly distinguishes it from other characters. The joiner preserves the shape of the stroke connecting a character to the next character. In this paper, an implicit Urdu character recognition technique is presented for the Nastalique writing style, based on the recognition of characters and joiners. A detailed analysis of Nastalique calligraphy is carried out to extract the artistic features of characters and their joiners. The presented technique is tested on Dataset-1, which comprises 1446 ligature classes covering 3309762 ligature instances and 91129 unique Urdu words. The system is also tested on 1600 text lines of the UPTI dataset, referred to as Dataset-2. The character recognition accuracies are 95.58% and 98.37% on Dataset-1 and Dataset-2, respectively. The results show that the system outperforms state-of-the-art hidden Markov model and deep learning-based Urdu recognition techniques.
Database: Directory of Open Access Journals