Machine Learning-Based E-Archive for Archives Management of South Sumatra Province
Author: | Toni Tri Atmojo, Yesi Novaria Kunang |
---|---|
Language: | English, Indonesian |
Year of publication: | 2023 |
Subject: | |
Source: | Journal of Information Systems and Informatics, Vol 5, Iss 4, Pp 1491-1507 (2023) |
Document type: | article |
ISSN: | 2656-5935 2656-4882 |
DOI: | 10.51519/journalisi.v5i4.566 |
Description: | Archives play a crucial role in institutional operations, yet retrieving specific information from them efficiently can be difficult. This research addresses the problem by developing an information retrieval system built on two ranking methods. The first is TF-IDF (Term Frequency-Inverse Document Frequency) weighting, which measures how significant a word is to a document within a collection. The second is BM25, a ranking function that scores documents by their relevance to the input query. Both methods are applied after a preprocessing stage, and each computes the relevance of every document to the given query. The system is evaluated with three metrics: precision (the share of retrieved documents that are relevant), recall (the share of relevant documents that are retrieved), and the F1 score (the harmonic mean of precision and recall). Testing with various keywords on a dataset of 350 documents showed that BM25 achieved an average precision of 0.75, recall of 0.6, and F1 score of 0.6665. TF-IDF scored lower, with a precision of 0.33, recall of 0.2, and F1 score of 0.2500. |
Database: | Directory of Open Access Journals |
External link: |
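
The abstract names BM25 ranking and the precision, recall, and F1 metrics but does not show how they are computed. The following is a minimal sketch, not the authors' implementation: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the smoothed IDF variant, the sample documents, and the function names `tokenize`, `bm25_scores`, and `precision_recall_f1` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed, simplified preprocessing: lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF (always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical mini-collection standing in for the archive documents.
docs = [
    "land certificate archive 2019",
    "employee salary report",
    "archive of land ownership letters",
    "meeting minutes",
]
scores = bm25_scores("land archive", docs)
retrieved = [i for i, s in enumerate(scores) if s > 0]

# Assumed relevance judgments for this query.
relevant = [0, 2, 3]
print(retrieved, precision_recall_f1(retrieved, relevant))
```

In this toy example the two documents containing the query terms are retrieved, and the shorter one ranks higher because of BM25's length normalization. Averaging the per-query metrics over a set of test keywords would give figures comparable in kind to those reported in the abstract.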