Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA
Author: | Abraham Tzou, Jennifer Pecson, Tzu-Yu Liu, Signe Fransen, John St. John, David E. Weinberg, Riley Ennis, Yaping Liu, Brandon J. Rice, Daniel Delubac, Nathan Boley, Marvin Bertin, Katherine E. Niehaus, Leilani Young, Aarushi Sharma, Girish Putcha, Adam Drake, James Cregg, Erik Gafni, Nathan Wan, Catherina Tang, Derek Bowen, Brandon White, Imran S. Haque, Ajay Kannan, Mitch Bailey, Gabriel E. Sanderson, Eric A. Ariazi, Gabriel Otte, Loren Hansen |
---|---|
Year of publication: | 2019 |
Subject: |
0301 basic medicine; Male; Cancer Research; Colorectal cancer; Plasma cell; computer.software_genre; Circulating Tumor DNA; Machine Learning; Cell-free DNA; 0302 clinical medicine; Surgical oncology; Tumor stage; Medicine; Early-stage cancer; Aged, 80 and over; 0303 health sciences; Confounding; Genomics; Middle Aged; lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens; medicine.anatomical_structure; Oncology; Cell-free fetal DNA; 030220 oncology & carcinogenesis; Cohort; Screening; Female; Colorectal Neoplasms; Research Article; Early detection; Machine learning; lcsh:RC254-282; Free dna; 03 medical and health sciences; Text mining; Genetics; Biomarkers, Tumor; Humans; 030304 developmental biology; Aged; Neoplasm Staging; Whole genome sequencing; Whole-genome sequencing; business.industry; Genome, Human; Gene Expression Profiling; Computational Biology; Reproducibility of Results; medicine.disease; 030104 developmental biology; ROC Curve; Artificial intelligence; business; Transcriptome; computer |
Source: | BMC Cancer, Vol 19, Iss 1, Pp 1-10 (2019) |
ISSN: | 1471-2407 |
Description: | Background: Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. However, early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage disease. A machine learning approach to discover signatures in cfDNA, potentially reflective of both tumor and non-tumor contributions, may represent a promising direction for the early detection of cancer. Methods: Whole-genome sequencing was performed on cfDNA extracted from plasma samples (N = 546 colorectal cancer and 271 non-cancer controls). Reads aligning to protein-coding gene bodies were extracted, and read counts were normalized. cfDNA tumor fraction was estimated using IchorCNA. Machine learning models were trained using k-fold cross-validation and confounder-based cross-validation to assess generalization performance. Results: In a colorectal cancer cohort heavily weighted towards early-stage cancer (80% stage I/II), we achieved a mean AUC of 0.92 (95% CI 0.91-0.93) with a mean sensitivity of 85% (95% CI 83-86%) at 85% specificity. Sensitivity generally increased with tumor stage and increasing tumor fraction. Stratification by age, sequencing batch, and institution demonstrated the impact of these confounders and provided a more accurate assessment of generalization performance. Conclusions: A machine learning approach using cfDNA achieved high sensitivity and specificity in a large, predominantly early-stage, colorectal cancer cohort. The possibility of systematic technical and institution-specific biases warrants similar confounder analyses in other studies. Prospective validation of this machine learning method and evaluation of a multi-analyte approach are underway. |
Database: | OpenAIRE |
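
The abstract mentions two evaluation ideas that a short sketch may make more concrete: confounder-based cross-validation (holding out all samples that share a confounder value, such as an institution or sequencing batch) and sensitivity reported at a fixed specificity. The Python below is only an illustration under stated assumptions, not the authors' pipeline: the function names `leave_one_group_out` and `sensitivity_at_specificity`, the leave-one-group-out splitting scheme, and the thresholding rule are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per distinct group.

    `groups` holds one confounder label per sample (e.g. institution).
    Holding out a whole group checks generalization to an unseen group.
    This splitting scheme is an assumption, not the paper's exact method.
    """
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    for held_out in sorted(by_group):
        test = by_group[held_out]
        train = sorted(
            i for g, idx in by_group.items() if g != held_out for i in idx
        )
        yield held_out, train, test


def sensitivity_at_specificity(scores, labels, specificity=0.85):
    """Sensitivity at a threshold that keeps at least `specificity` of controls negative.

    Labels: 0 = non-cancer control, 1 = cancer. A sample is called
    positive when its score is strictly above the threshold.
    """
    controls = sorted(s for s, y in zip(scores, labels) if y == 0)
    cases = [s for s, y in zip(scores, labels) if y == 1]
    if not controls or not cases:
        raise ValueError("need at least one control and one case")
    # Smallest control score with the required fraction of controls at or below it.
    k = max(math.ceil(specificity * len(controls)), 1)
    threshold = controls[k - 1]
    return sum(s > threshold for s in cases) / len(cases)
```

In use, a model would be fit on each `train` index list and scored on the matching `test` list; comparing those held-out results with ordinary k-fold results is one way to see how much a confounder inflates apparent performance.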