Description: |
Aortic dissection (AD) is a rare, high-risk cardiovascular disease with high mortality. Because its clinical manifestations are complex and variable, it is easily missed or misdiagnosed. In this paper, we propose a clustering-based ensemble learning model, Cluster Random under-sampling Smote–Tomek Bagging (CRST-Bagging), to help clinicians screen for AD patients in the early phase and help save their lives. Within this model, the CRST method combines the advantages of k-means++ clustering and Smote–Tomek sampling to handle an extremely imbalanced AD dataset; the Bagging algorithm is then used to predict AD patients. We built the AD dataset from routine examination data of AD patients and other cardiovascular patients collected at Xiangya Hospital. Experiments on the original AD dataset verified the effectiveness of the CRST method in resampling. We compared our model with RUSBoost and SMOTEBagging on the original dataset and on a test dataset, and the results show that our model performed better. On the test dataset, our model achieved a precision of 83.6% and a recall of 80.7%. Its F1-score was 82.1%, which is 4.8% and 1.6% higher than those of RUSBoost and SMOTEBagging, respectively, demonstrating its effectiveness in AD screening. |
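The reported test-set figures can be checked for internal consistency. The sketch below assumes the F1-score is the usual harmonic mean of precision and recall; the abstract does not state the formula, and the helper name `f1_score` is hypothetical, introduced only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the test dataset in the abstract.
reported_precision = 0.836
reported_recall = 0.807

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1 * 100:.1f}%")  # prints "F1 = 82.1%"
```

Under that assumption, the computed value agrees with the 82.1% F1-score stated in the abstract.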