An Improved Ensemble-Based Cardiovascular Disease Detection System with Chi-Square Feature Selection

Autor: Ayad E. Korial, Ivan Isho Gorial, Amjad J. Humaidi
Jazyk: angličtina
Rok vydání: 2024
Předmět:
Zdroj: Computers, Vol 13, Iss 6, p 126 (2024)
Druh dokumentu: article
ISSN: 2073-431X
DOI: 10.3390/computers13060126
Popis: Cardiovascular disease (CVD) is a leading cause of death globally; therefore, early detection of CVD is crucial. Many intelligent technologies, including deep learning and machine learning (ML), are being integrated into healthcare systems for disease prediction. This paper uses a voting ensemble ML with chi-square feature selection to detect CVD early. Our approach involved applying multiple ML classifiers, including naïve Bayes, random forest, logistic regression (LR), and k-nearest neighbor. These classifiers were evaluated through metrics including accuracy, specificity, sensitivity, F1-score, confusion matrix, and area under the curve (AUC). We created an ensemble model by combining predictions from the different ML classifiers through a voting mechanism, whose performance was then measured against individual classifiers. Furthermore, we applied chi-square feature selection method to the 303 records across 13 clinical features in the Cleveland cardiac disease dataset to identify the 5 most important features. This approach improved the overall accuracy of our ensemble model and reduced the computational load considerably by more than 50%. Demonstrating superior effectiveness, our voting ensemble model achieved a remarkable accuracy of 92.11%, representing an average improvement of 2.95% over the single highest classifier (LR). These results indicate the ensemble method as a viable and practical approach to improve the accuracy of CVD prediction.
Databáze: Directory of Open Access Journals