APPLICATION OF MACHINE LEARNING ALGORITHMS FOR HEART DISEASE PREDICTION

Author: Anna I. Pavlova
Language: English, Russian
Year of publication: 2023
Subject:
Source: Siberian Journal of Life Sciences and Agriculture, Vol 15, Iss 3, Pp 475-496 (2023)
Document type: article
ISSN: 2658-6649, 2658-6657
DOI: 10.12731/2658-6649-2023-15-3-475-496
Description: This paper focuses on the application of machine learning algorithms to the prediction of cardiovascular diseases (CVDs). A large number of deaths are registered worldwide every year, and according to the World Health Organization, CVDs are the leading cause of death. One of the necessary preventive measures for reducing CVD mortality is the timely prediction of disease in people at high risk. Specially developed risk scales and machine learning algorithms are now used for this purpose. Background. The algorithms most often used to predict heart disease are the Gaussian Naïve Bayes Classifier (GNBC), K-Nearest Neighbours (KNN), and the Decision Tree (DT). In the domestic literature, there are known works on CVD prediction that use the Adam gradient algorithm to train deep neural networks. One of the necessary conditions for increasing the predictive ability of a machine learning model (MLM) is the optimal selection of hyperparameters, which is often made on the basis of empirical experience. Purpose. To explore the specifics of applying machine learning algorithms to the prediction of heart disease. Materials and methods. The scientific novelty of the work lies in analysing machine learning algorithms for predicting CVD risk using automatic search for MLM hyperparameters. The following algorithms were used to construct the MLMs: GNBC, KNN, DT, Logistic Regression, Support Vector Machine (SVM), Random Forest (RF), Complement Naïve Bayes Classifier (CNBC), Linear Discriminant Analysis (LDA), Radial Basis Function (RBF), and Gradient Boosting (XGBoost). The following indicators were used to evaluate the accuracy of the models: mean absolute error (MAE), precision, recall, F-measure (F-beta), False Positive Rate (FPR), and False Negative Rate (FNR).
Additionally, visual analysis of the receiver operating characteristic (ROC) curve and the area under the curve (AUC) was used to analyse the MLM results. The AUC value makes it possible to estimate the prognostic ability of an MLM. Results. The training results showed that the RF and XGBoost algorithms achieve higher accuracy. With optimal selection of MLM parameters, the overall classification accuracy was 0.88 and 0.94, respectively. Conclusion. Machine learning algorithms allow predictive models to be built with high accuracy, which requires the construction of a machine learning model. The ensemble machine learning algorithms RF and XGBoost achieve higher accuracy than decision trees, Bayesian classification methods, logistic regression, and linear discriminant analysis.
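
The evaluation indicators named in the description can be illustrated with a minimal sketch. It is not the author's code: the function names `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration and are unrelated to the paper's data or results. The sketch assumes binary labels (1 = disease present) and computes AUC as the share of positive/negative pairs that the model ranks correctly.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   beta: float = 1.0) -> Dict[str, float]:
    """Confusion-matrix metrics for binary labels (1 = disease, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    b2 = beta ** 2
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        # For 0/1 labels, MAE equals the misclassification rate.
        "mae": ratio(fp + fn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f_beta": ratio((1 + b2) * precision * recall, b2 * precision + recall),
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print("AUC:", auc)
```

In a screening setting, a `beta` greater than 1 weights recall more heavily than precision, which may be preferable when a missed case (a false negative) is costlier than a false alarm.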
Database: Directory of Open Access Journals