EXAMINATION OF SUMMARIZED MEDICAL RECORDS FOR ICD CODE CLASSIFICATION VIA BERT

Authors: Dilek AYDOGAN-KILIC, Deniz Kenan KILIC, Izabela Ewa NIELSEN
Language: English
Year of publication: 2024
Subject:
Source: Applied Computer Science, Vol 20, Iss 2 (2024)
Document type: article
ISSN: 2353-6977
DOI: 10.35784/acs-2024-16
Description: The International Classification of Diseases (ICD) is used by member countries of the World Health Organization (WHO). It is a critical system for standardizing diagnosis codes worldwide, which enables data comparison and analysis across nations. The ICD system is essential in supporting payment systems, healthcare research, service planning, and quality and safety management. However, its intricate structure can cause problems such as longer examination times, increased training expenses, a greater need for human resources, payment errors due to inaccurate coding, and unreliable data in health research. In addition, machine learning models for automated ICD coding struggle with lengthy medical notes. To address this challenge, the present study uses Medical Information Mart for Intensive Care (MIMIC-III) medical notes that have been summarized with the term frequency-inverse document frequency (TF-IDF) method. The summarized notes are then analyzed with deep learning, specifically bidirectional encoder representations from transformers (BERT), to classify disease diagnoses by ICD code. Although the proposed methodology using summarized data achieves lower accuracy than state-of-the-art methods, the results are promising and motivate further work on extracting summarized inputs and more informative features, since the approach provides real-time ICD code classification and more explainable inputs.
Database: Directory of Open Access Journals
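
The TF-IDF summarization step mentioned in the description can be illustrated with a minimal sketch. This record does not give the article's actual procedure, so everything below is an assumption made for illustration: each sentence is treated as its own "document" for the IDF statistics, a sentence is scored by the mean TF-IDF weight of its distinct terms, and the top-scoring sentences are kept in their original order. The function names (`split_sentences`, `tokenize`, `tfidf_summarize`) and the `max_sentences` parameter are hypothetical. The BERT classification step is not shown.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def tfidf_summarize(text, max_sentences=2):
    """Return the max_sentences highest-scoring sentences, in original order.

    Assumption: every sentence counts as one document when computing IDF,
    and a sentence's score is the mean TF-IDF weight of its distinct terms.
    """
    sentences = split_sentences(text)
    if len(sentences) <= max_sentences:
        return sentences

    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)

    # Document frequency: number of sentences that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum(
            (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(total / len(tf))

    # Pick the best-scoring sentence indices, then restore note order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    note = (
        "Patient admitted with chest pain. Patient was stable. "
        "Patient was stable. Echocardiogram showed severe aortic stenosis."
    )
    for sentence in tfidf_summarize(note, max_sentences=2):
        print(sentence)
```

In this toy note the repeated sentence scores lowest because its terms occur in several sentences, so the two more distinctive sentences are kept. A shortened note of this kind would then be passed to a BERT classifier in place of the full text.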