Development and Evaluation of Novel Statistical Methods in Urine Biomarker-Based Hepatocellular Carcinoma Screening
Author: | Dion Chen, Surbhi Jain, Ying Hsiu Su, Jeremy Wang, Wei Song, Chi Tan Hu |
---|---|
Publication year: | 2017 |
Subject: |
0301 basic medicine; Oncology; Adult; Male; medicine.medical_specialty; Multivariate analysis; Biometry; Carcinoma, Hepatocellular; lcsh:Medicine; Logistic regression; Article; Machine Learning; 03 medical and health sciences; Young Adult; Internal medicine; medicine; Biomarkers, Tumor; Humans; Mass Screening; lcsh:Science; Survival rate; Aged; Aged, 80 and over; Multidisciplinary; Receiver operating characteristic; business.industry; lcsh:R; Liver Neoplasms; Middle Aged; medicine.disease; Regression; 3. Good health; Random forest; 030104 developmental biology; Hepatocellular carcinoma; Area Under Curve; Multivariate Analysis; Biomarker (medicine); Regression Analysis; lcsh:Q; Female; business |
Source: | Scientific Reports, Vol 8, Iss 1, Pp 1-8 (2018) |
ISSN: | 2045-2322 |
Description: | Hepatocellular carcinoma (HCC) is one of the fastest-growing cancers in the US and has a low survival rate, partly due to difficulties in early detection. Because of HCC’s high heterogeneity, it has been suggested that multiple biomarkers would be needed to develop a sensitive HCC screening test. This study applied random forest (RF), a machine learning technique, and proposed two novel models, fixed sequential (FS) and two-step (TS), for comparison with two commonly used statistical techniques, logistic regression (LR) and classification and regression trees (CART), in combining multiple urine DNA biomarkers for HCC screening using biomarker values obtained from 137 HCC and 431 non-HCC (224 hepatitis and 207 cirrhosis) subjects. The sensitivity, specificity, area under the receiver operating characteristic curve, and variability were estimated through repeated 10-fold cross-validation to compare the models’ performances in accuracy and robustness. We show that RF and TS have higher accuracy and stability; specifically, they reach 90% specificity and 86% and 87% sensitivity, respectively, along with 15% higher sensitivity and 10% higher specificity than LR in cross-validation. The potential of RF and TS to develop a panel of multiple biomarkers and the possibility for self-training, cloud-based models for HCC screening are discussed. |
Database: | OpenAIRE |
External link: |
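
The evaluation scheme named in the description (sensitivity, specificity, and area under the ROC curve estimated through repeated 10-fold cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the biomarker values are synthetic (only the class sizes, 137 and 431, come from the record), the single-marker midpoint threshold is a hypothetical stand-in classifier, and the helper names are invented. It does not reproduce the study's RF, LR, CART, FS, or TS models, which this record does not specify.

```python
import random
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, rng):
    """Split indices into k folds, dealing each class out round-robin after shuffling."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fit_threshold(train_x, train_y):
    """Hypothetical stand-in model: cut at the midpoint between the two class means."""
    pos_mean = mean(v for v, t in zip(train_x, train_y) if t == 1)
    neg_mean = mean(v for v, t in zip(train_x, train_y) if t == 0)
    cut = (pos_mean + neg_mean) / 2
    return lambda v: v - cut  # score >= 0 is predicted as a case


def repeated_cv(x, y, fit, k=10, repeats=5, seed=0):
    """Run `repeats` rounds of stratified k-fold CV; return per-fold (sens, spec, auc)."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            score = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
            truth = [y[i] for i in test_idx]
            scores = [score(x[i]) for i in test_idx]
            preds = [1 if s >= 0 else 0 for s in scores]
            sens, spec = sensitivity_specificity(truth, preds)
            results.append((sens, spec, auc(truth, scores)))
    return results


# Synthetic single-marker data (invented values; class sizes follow the record).
data_rng = random.Random(42)
y = [1] * 137 + [0] * 431
x = [data_rng.gauss(1.0, 1.0) if t == 1 else data_rng.gauss(0.0, 1.0) for t in y]

results = repeated_cv(x, y, fit_threshold, k=10, repeats=5)
for name, column in zip(("sensitivity", "specificity", "AUC"), zip(*results)):
    # The mean summarizes accuracy; the spread across folds summarizes variability.
    print(f"{name}: mean={mean(column):.3f} sd={stdev(column):.3f}")
```

Reporting the standard deviation across folds alongside the mean is one simple way to express the "variability" that the description uses to compare model robustness; swapping `fit_threshold` for another fitting function would let the same loop compare several models on identical folds.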