Popis: |
Cardiovascular disease (CVD) is a leading cause of death globally; therefore, early detection of CVD is crucial. Many intelligent technologies, including deep learning and machine learning (ML), are being integrated into healthcare systems for disease prediction. This paper uses a voting ensemble ML with chi-square feature selection to detect CVD early. Our approach involved applying multiple ML classifiers, including naïve Bayes, random forest, logistic regression (LR), and k-nearest neighbor. These classifiers were evaluated through metrics including accuracy, specificity, sensitivity, F1-score, confusion matrix, and area under the curve (AUC). We created an ensemble model by combining predictions from the different ML classifiers through a voting mechanism, whose performance was then measured against individual classifiers. Furthermore, we applied chi-square feature selection method to the 303 records across 13 clinical features in the Cleveland cardiac disease dataset to identify the 5 most important features. This approach improved the overall accuracy of our ensemble model and reduced the computational load considerably by more than 50%. Demonstrating superior effectiveness, our voting ensemble model achieved a remarkable accuracy of 92.11%, representing an average improvement of 2.95% over the single highest classifier (LR). These results indicate the ensemble method as a viable and practical approach to improve the accuracy of CVD prediction. |