Probabilistic Ensemble Framework for Injury Narrative Classification

Authors: Srushti Vichare, Gaurav Nanda, Raji Sundararajan
Language: English
Publication year: 2024
Subject:
Source: AI, Vol 5, Iss 3, Pp 1684-1694 (2024)
Document type: article
ISSN: 2673-2688
DOI: 10.3390/ai5030082
Description: In this research, we analyzed narratives from the National Electronic Injury Surveillance System (NEISS) dataset to predict the top two injury codes, comparing four ensemble machine learning (ML) models: Random Forest (RF) combined with Logistic Regression (LR), K-Nearest Neighbor (KNN) paired with RF, LR combined with KNN, and a model integrating LR, RF, and KNN. All four use a probabilistic likelihood-based approach to improve decision-making across the different classifiers. The KNN + LR ensemble achieved an accuracy of 90.47% for the top-one prediction, while the KNN + RF + LR model predicted the top two injury codes with an accuracy of 99.50%. These results demonstrate the potential of ensemble models to improve the classification accuracy of unstructured narratives, particularly for underrepresented cases. They also suggest that the proposed probabilistic ensemble framework can improve decision-making in public health and safety, and they provide a foundation for future research in automated clinical narrative classification and predictive modeling, especially with imbalanced data.
Database: Directory of Open Access Journals
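
Illustration: The description mentions combining classifiers through a probabilistic approach and scoring top-one and top-two predictions, but the record does not give the exact combination rule. The sketch below is therefore only an assumption of how such an ensemble could look: it averages per-class probabilities from several models (plain soft voting, which may differ from the authors' likelihood-based method) and ranks the codes. The function names, the injury labels, and the probability values are all hypothetical and are not taken from the paper or from NEISS.

```python
from typing import Dict, List, Sequence

Probs = Dict[str, float]  # maps an injury code to a predicted probability


def combine_probabilities(per_model: Sequence[Probs]) -> Probs:
    """Average class probabilities across models (simple soft voting).

    Assumption: equal weight per model; a class a model never predicts
    is treated as probability 0 for that model.
    """
    if not per_model:
        raise ValueError("need at least one model's probabilities")
    classes = sorted({code for probs in per_model for code in probs})
    n_models = len(per_model)
    return {
        code: sum(probs.get(code, 0.0) for probs in per_model) / n_models
        for code in classes
    }


def top_k(probs: Probs, k: int = 2) -> List[str]:
    """Return the k most probable codes, ties broken alphabetically."""
    ranked = sorted(probs.items(), key=lambda item: (-item[1], item[0]))
    return [code for code, _ in ranked[:k]]


def top_k_accuracy(predictions: Sequence[List[str]], truths: Sequence[str]) -> float:
    """Fraction of cases whose true code appears among the predicted codes."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("predictions and truths must be non-empty and aligned")
    hits = sum(truth in predicted for predicted, truth in zip(predictions, truths))
    return hits / len(truths)


# Hypothetical probabilities for one narrative from three classifiers.
lr_probs = {"fall": 0.5, "struck": 0.3, "burn": 0.2}
rf_probs = {"fall": 0.3, "struck": 0.6, "burn": 0.1}
knn_probs = {"fall": 0.4, "struck": 0.4, "burn": 0.2}

combined = combine_probabilities([lr_probs, rf_probs, knn_probs])
top_two = top_k(combined, k=2)
print(top_two)
```

In this toy case the three models disagree on the single best code, yet the true code is likely to sit among the two highest-ranked ones, which is one intuition for why a top-two score can be much higher than a top-one score.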