Comparison of Classification Algorithm and Language Model in Accounting Financial Transaction Record: A Natural Language Processing Approach.

Authors: Makayasa, Bagas Adi; Siregar, Maria Ulfah; Sugiantoro, Bambang; Fatwanto, Agung
Subject:
Source: International Journal on Advanced Science, Engineering & Information Technology; 2024, Vol. 14 Issue 3, p880-886, 7p
Abstract: Financial records that do not follow accounting principles can cause avoidable problems. Many micro, small, and medium enterprises, though not all, still face obstacles in preparing financial reports, even though much financial software is already available. Our study investigates opportunities for automating the recording of accounting financial transactions using an NLP approach that interprets the text written on a transaction form into accounting journal entries (debits and credits). Experiments compared the performance of three classification algorithms, namely SVM, K-Nearest Neighbor, and Random Forest, paired with traditional (TF-IDF and BOW) and contextual (Word2Vec) language models. The dataset contains 200 financial transactions in ten classes and is divided into two parts: a balanced dataset and an imbalanced dataset. The pairing of SVM and Word2Vec on the balanced dataset gave the highest accuracy (92.5%), precision (92.5%), recall/sensitivity (93.33%), and F1 score (92%). However, these results are still lower than those of related semantic research, where average performance reaches 95%. One factor that may have a significant effect is the small amount of data in the corpus. The researchers suggest increasing the size of the dataset and combining other language models such as GloVe and BERT. This study can also serve as a model for more complex financial transaction cases in future research. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
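
The abstract describes classifying transaction text into journal entries by pairing a text representation with a classifier. The following is a minimal sketch of one such pairing only, TF-IDF with K-Nearest Neighbor, written in plain Python; it is not the authors' implementation. The toy transactions, the journal-entry labels, the smoothed IDF formula, the cosine similarity measure, and all function names (`tokenize`, `fit_idf`, `tfidf`, `cosine`, `predict`) are assumptions made for illustration, and the study's actual dataset, preprocessing, and parameters are not given in this record.

```python
import math
from collections import Counter

# Hypothetical toy corpus (not the study's data): transaction text -> journal entry.
TRAIN = [
    ("received cash from customer for goods sold", "Dr Cash / Cr Sales Revenue"),
    ("cash sale of merchandise to customer", "Dr Cash / Cr Sales Revenue"),
    ("paid monthly office rent in cash", "Dr Rent Expense / Cr Cash"),
    ("paid rent for the shop building", "Dr Rent Expense / Cr Cash"),
    ("bought supplies on credit from supplier", "Dr Supplies / Cr Accounts Payable"),
    ("purchased office supplies on credit", "Dr Supplies / Cr Accounts Payable"),
]


def tokenize(text):
    """Lowercase and split on whitespace (assumed preprocessing)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in training are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Fit the representation on the training texts once.
TOKENS = [tokenize(text) for text, _ in TRAIN]
LABELS = [label for _, label in TRAIN]
IDF = fit_idf(TOKENS)
VECTORS = [tfidf(tokens, IDF) for tokens in TOKENS]


def predict(text, k=3):
    """Majority vote among the k most similar training transactions."""
    query = tfidf(tokenize(text), IDF)
    ranked = sorted(
        zip(VECTORS, LABELS),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for sample in ["paid rent in cash", "purchased supplies on credit"]:
        print(sample, "->", predict(sample))
```

Swapping the `tfidf` step for raw term counts would give a bag-of-words variant; SVM, Random Forest, and Word2Vec, which the abstract also compares, would in practice come from a machine learning library rather than a hand-written sketch like this one.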