Construction of postoperative prognostic model for primary liver cancer based on SMOTE and machine learning

Author: PAN Bi, YU Jinghu, HUANG Yixian
Language: Chinese
Publication year: 2024
Subject:
Source: 陆军军医大学学报, Vol 46, Iss 19, Pp 2236-2240 (2024)
Document type: article
ISSN: 2097-0927
DOI: 10.16016/j.2097-0927.202310052
Description: Objective To construct a prognostic prediction model for primary liver cancer after surgical treatment based on the synthetic minority over-sampling technique (SMOTE) algorithm and machine learning models. Methods A retrospective cohort study was conducted on 4 297 patients with primary liver cancer from the Surveillance, Epidemiology, and End Results (SEER) database. One-hot encoding and multiple imputation were used to preprocess the collected data, and the SMOTE algorithm was employed to address class imbalance. The obtained clinical variables were included in the machine learning models. Prognostic prediction models (SMOTE+DT/RF/GBDT/XGBoost) were built based on decision tree (DT), random forest (RF), gradient boosting decision tree (GBDT) and eXtreme Gradient Boosting (XGBoost), and the best model was determined by comparing their performance. Finally, a prognostic analysis system for primary liver cancer was developed based on the optimal model and then visualized. Results The combined SMOTE+RF model showed the best predictive performance, with a higher area under the curve (0.895), accuracy (0.811) and precision (0.806) than the other models in receiver operating characteristic (ROC) curve analysis. Conclusion The SMOTE+RF prognostic prediction model can effectively predict the survival outcome of patients with primary liver cancer.
Database: Directory of Open Access Journals
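
The abstract names SMOTE as the step that corrects class imbalance before model fitting. As a rough illustration only, and not the authors' code, the core idea can be sketched in plain Python: each synthetic minority row is placed at a random point on the segment between a minority row and one of its k nearest minority neighbours. The function names `smote` and `balance`, the toy data, and the choice of k are assumptions made for this sketch; it also assumes purely numeric features, whereas one-hot encoded columns would need a categorical-aware variant.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority rows to the base row (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_new = max((len(y) - len(minority)) - len(minority), 0)
    new_rows = smote(minority, n_new, k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Hypothetical toy data: six majority rows (label 0), two minority rows (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.1, 0.0], [0.2, 0.3], [1.0, 1.0], [1.2, 0.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y)
```

In practice one would more likely use a maintained implementation such as the `SMOTE` class in the imbalanced-learn package, and apply the oversampling to the training split only so that the reported AUC, accuracy and precision are measured on unmodified data; whether the study did so is not stated in the abstract.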