Description: |
Developing radiomics-based machine learning (ML) models has drawn considerable attention in recent years. However, identifying a small, optimal feature vector from which to build a robust ML model remains a controversial issue. In this study, we investigate the feasibility of applying a random projection algorithm to create an optimal feature vector from a large pool of features generated by a computer-aided detection (CAD) scheme, and thereby improve the performance of the ML model. We assemble a retrospective dataset of abdominal computed tomography (CT) images acquired from 188 patients diagnosed with gastric cancer. Among them, 141 cases have peritoneal metastasis (PM), while 47 cases do not. A CAD scheme is applied to segment the gastric tumor area and compute 325 image features. Then, two logistic regression models are built, each embedded with a different feature dimensionality reduction method: principal component analysis (PCA) or a random projection algorithm (RPA). Afterward, a synthetic minority oversampling technique (SMOTE) is used to balance the dataset. The proposed ML model is built to predict the risk of patients having advanced gastric cancer (AGC). All logistic regression models are trained and tested using a leave-one-case-out cross-validation method. Results show that the logistic regression model embedded with RPA yielded a significantly higher AUC value (0.69±0.025) than the model using PCA (0.62±0.014) (p |