A Machine Learning Model for Prostate Cancer Prediction in Korean Men

Authors: Sukjung Choi, Beomgi So, Shane Oh, Hongzoo Park, Sang Wook Lee, Geehyun Song, Jong Min Lee, Jung Ki Jo, Seon Hyeok Kim, Si Eun Lee, Eun-Bi Cho, Jae Hung Jung, Jeong Hyun Kim
Language: English
Year of publication: 2024
Subject:
Source: Journal of Urologic Oncology, Vol 22, Iss 3, Pp 201-210 (2024)
Document type: article
ISSN: 2951-603X, 2982-7043
DOI: 10.22465/juo.244800400020
Description: Purpose: Unnecessary prostate biopsies for detecting prostate cancer (PCa) should be minimized. Therefore, this study developed a machine learning (ML) model to predict PCa in Korean men and evaluated its usability. Materials and Methods: We retrospectively analyzed clinical data from 928 patients who underwent prostate biopsies at Kangwon National University Hospital between May 2013 and May 2023. Of these, 377 (40.6%) were diagnosed with PCa, and 551 (59.4%) did not have cancer. For external validation, clinical data from 385 patients aged 48–89 years who underwent prostate biopsies from September 2005 to September 2023 at Wonju Severance Christian Hospital were also included. Twenty-two clinical features were considered in developing an ML model to predict PCa. Features were selected based on their contributions to model performance, leading to the inclusion of 15 features. A meta-learner was constructed using logistic regression to predict the probability of PCa, and the classifier was trained and validated on training and test sets randomly split at an 8:2 ratio. Results: The prostate health index, prostate volume, age, nodule on digital rectal examination, and prostate-specific antigen were the top 5 features for predicting PCa. The area under the receiver operating characteristic curve (AUC) of the meta-learner logistic regression model was 0.89, and the accuracy, sensitivity, and specificity were 0.828, 0.711, and 0.909, respectively. The model also showed excellent prediction performance for high-grade PCa (Gleason score of 7 or higher), with an AUC of 0.903. Furthermore, when evaluated on the external cohort clinical data, the model achieved an AUC of 0.863. Conclusions: Our ML model excelled in predicting PCa, specifically clinically significant PCa. Although extensive validation in other clinical cohorts is needed, this ML model is a promising option for future diagnostics.
Database: Directory of Open Access Journals
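
Note on the reported metrics: the abstract above gives an AUC together with accuracy, sensitivity, and specificity. The following is a minimal Python sketch of how such figures can be computed from predicted probabilities. It is not the authors' code; the function names, the toy labels and probabilities, and the 0.5 decision threshold are all assumptions made for illustration, and the record does not state which threshold the study used.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical toy data (1 = cancer on biopsy, 0 = no cancer); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

auc = roc_auc(y_true, y_prob)
metrics = classification_metrics(y_true, y_prob)
print(f"AUC={auc:.3f}", {k: round(v, 3) for k, v in metrics.items()})
```

AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cut-off, so a model can combine a high AUC with modest sensitivity, as in the figures reported above, when the threshold favors specificity.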