Advancing Optical Character Recognition for Low-Resource Scripts: A Siamese Meta-Learning Approach With PSN Framework

Authors: Anirudha Ghosh, Debaditya Barman, Abu Sufian, Ibrahim A. Hameed
Language: English
Year of publication: 2024
Subject:
Source: IEEE Access, Vol 12, Pp 189651-189666 (2024)
Document type: article
ISSN: 2169-3536
08514453
DOI: 10.1109/ACCESS.2024.3509605
Description: With the increasing demand for digitization, Optical Character Recognition (OCR) systems play a vital role in digitizing physical manuscripts. Several methods have been deployed successfully in the OCR domain, but they often struggle with low-resource regional scripts because training data is limited and the characters are structurally complex. In this setting, Siamese Network (SN) meta-learning offers a promising solution by enabling quick adaptation to new tasks with minimal training data. Despite the success of SNs in various classification tasks, the traditional SN architecture needs an upgrade to better distinguish between similar-looking characters of regional scripts. In this paper, we propose a novel Priority-Smart Network (PSN) framework for traditional SN architectures, which can easily be incorporated into existing CNN backbones and improves their ability to identify characters in low-resource regional scripts. Furthermore, we propose the Enhanced Differential Edge Detection (EDED) preprocessing strategy, designed explicitly for OCR tasks. We evaluate the proposed techniques on three benchmark low-resource script datasets to establish their effectiveness. Our experimental results show significant improvements in character recognition accuracy and robustness, underscoring the potential of SNs combined with the PSN framework and EDED strategy for improving OCR systems for low-resource scripts.
Database: Directory of Open Access Journals