Deep Learning Approaches for Nusantara Scripts Optical Character Recognition

Authors: Agi Prasetiadi, Julian Saputra, Iqsyahiro Kresna, Imada Ramadhanti
Language: English, Indonesian
Year of publication: 2023
Subject:
Source: IJCCS (Indonesian Journal of Computing and Cybernetics Systems), Vol 17, Iss 3, Pp 325-336 (2023)
Document type: article
ISSN: 1978-1520, 2460-7258
DOI: 10.22146/ijccs.86302
Description: The number of speakers of regional languages in Indonesia who can read and write traditional scripts is decreasing. If left unaddressed, this will lead to the extinction of Nusantara scripts, and their reading methods may well be forgotten in the future. To anticipate this, this study aims to preserve the knowledge of reading ancient scripts by developing a Deep Learning model that can read document images written in one of the 10 Nusantara scripts we have collected: Bali, Batak, Bugis, Javanese, Kawi, Kerinci, Lampung, Pallava, Rejang, and Sundanese. Previous studies have attempted to read traditional Nusantara scripts using various Machine Learning and Convolutional Neural Network algorithms, but they focused primarily on specific scripts and lacked an integrated approach running from script type recognition to character recognition. This study is the first to comprehensively address the entire range of Nusantara scripts, encompassing both script type detection and character recognition. Convolutional Neural Network, ConvMixer, and Visual Transformer models were used and their respective performances compared. The results show that our models achieved 96% accuracy in classifying Nusantara script types, with character recognition accuracy ranging from 93% to approximately 100% across the ten scripts.
Database: Directory of Open Access Journals
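
Note: The description above outlines a two-stage flow, in which the script type is detected first and the characters are then recognized. The following is a minimal sketch of that routing only, not the authors' implementation. The function `recognize`, the `Image` alias, and the idea of one character recognizer per script are assumptions made for illustration. The record does not say how the models are wired together, and the stand-in callables below take the place of trained CNN, ConvMixer, or Transformer models.

```python
from typing import Callable, Dict, List, Tuple

# The ten scripts named in the description.
SCRIPTS = ["Bali", "Batak", "Bugis", "Javanese", "Kawi",
           "Kerinci", "Lampung", "Pallava", "Rejang", "Sundanese"]

# Hypothetical stand-in for a grayscale document image.
Image = List[List[float]]


def recognize(
    image: Image,
    script_classifier: Callable[[Image], str],
    char_recognizers: Dict[str, Callable[[Image], List[str]]],
) -> Tuple[str, List[str]]:
    """Stage 1: predict the script type. Stage 2: run that script's recognizer."""
    script = script_classifier(image)
    if script not in char_recognizers:
        raise KeyError(f"no character recognizer registered for {script!r}")
    return script, char_recognizers[script](image)


# Usage with dummy models (assumed placeholders, not trained networks).
def dummy_classifier(image: Image) -> str:
    return "Javanese"


dummy_recognizers = {
    name: (lambda image, name=name: [f"{name.lower()}_char"])
    for name in SCRIPTS
}

result = recognize([[0.0]], dummy_classifier, dummy_recognizers)
```

Routing through a dictionary keyed by script name keeps the script-type model independent of the character models, so either stage could be swapped out, for example to compare architectures, without touching the other.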