Identifying metabolic dysfunction-associated steatotic liver disease in patients with hypertension and pre-hypertension: An interpretable machine learning approach

Authors:	Chen Chen, Wenkang Zhang, Gaoliang Yan, Chengchun Tang
Language:	English
Publication year:	2024
Subject:	Computer applications to medicine. Medical informatics R858-859.7
Source:	Digital Health, Vol 10 (2024)
Document type:	article
ISSN:	2055-2076
DOI:	10.1177/20552076241233135
Description:	Objective: Metabolic dysfunction-associated steatotic liver disease (MASLD) is one of the most prevalent liver diseases and is associated with pre-hypertension and hypertension. This study aims to develop interpretable machine learning (ML) models that accurately identify MASLD in hypertensive and pre-hypertensive populations. Methods: Data on 4722 hypertensive and pre-hypertensive patients were drawn from the NAGALA study. Six ML models were used: decision tree, K-nearest neighbor, gradient boosting, naive Bayes, support vector machine, and random forest (RF). The optimal model was selected based on performance under K-fold cross-validation (k = 5), measured by the area under the receiver operating characteristic curve (AUC), average precision (AP), accuracy, sensitivity, specificity, and F1 score. Shapley additive explanation (SHAP) values were employed for both global and local interpretation of the model results. Results: The prevalence of MASLD in hypertensive and pre-hypertensive patients was 44.3% (362 cases) and 28.3% (1107 cases), respectively. The RF model outperformed the other five models, with an AUC of 0.889, AP of 0.800, accuracy of 0.819, sensitivity of 0.816, specificity of 0.821, and F1 score of 0.729. According to the SHAP analysis, the five most important features were alanine aminotransferase, body mass index, waist circumference, high-density lipoprotein cholesterol, and total cholesterol. Further analysis of feature selection in the RF model showed that incorporating all features yielded the best model performance. Conclusions: ML algorithms, especially the RF algorithm, improve the accuracy of MASLD identification, and global and local interpretation of the RF model results makes it possible to understand intuitively how individual features affect the likelihood of MASLD in patients with hypertension and pre-hypertension.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/d92d4c498be3497e95011eec27a77330
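
Note on the metrics named in the abstract: accuracy, sensitivity, specificity, and F1 all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions in plain Python, not the authors' code. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the NAGALA data.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and F1 for 0/1 labels.

    1 denotes the positive class (here: MASLD present), 0 the negative class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard each ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if (precision + sensitivity)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because F1 combines precision and sensitivity and ignores true negatives, it can sit well below accuracy when the positive class is the minority, which is consistent with the pattern of values reported in the abstract.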