SMuRF: portable and accurate ensemble prediction of somatic mutations
Authors: | Mei Mei Chang, Yu Guo, Anders Jacobsen Skanderup, Karthik Muthukumar, Weitai Huang, Probhonjon Baruah |
---|---|
Year of publication: | 2019 |
Subject: | Statistics and Probability; Computer science; Computational biology; Biochemistry; Genome; 03 medical and health sciences; 0302 clinical medicine; Germline mutation; Gene Frequency; Exome; Indel; Molecular Biology; Allele frequency; 030304 developmental biology; Whole genome sequencing; 0303 health sciences; High-Throughput Nucleotide Sequencing; Applications Notes; Computer Science Applications; Random forest; Computational Mathematics; Computational Theory and Mathematics; 030220 oncology & carcinogenesis; Mutation; Mutation (genetic algorithm); Supervised Machine Learning; Sequence Analysis |
Source: | Bioinformatics |
ISSN: | 1460-2059 1367-4803 |
Description: | Summary: Somatic Mutation calling method using a Random Forest (SMuRF) integrates predictions and auxiliary features from multiple somatic mutation callers using a supervised machine learning approach. SMuRF is trained on community-curated matched tumor and normal whole-genome sequencing data. It predicts both SNVs and indels with high accuracy in genome- or exome-level sequencing data. Furthermore, the method is robust across multiple tested cancer types and predicts low-allele-frequency variants with high accuracy. In contrast to existing ensemble-based somatic mutation calling approaches, SMuRF works out of the box and is orders of magnitude faster. Availability and implementation: The method is implemented in R and available at https://github.com/skandlab/SMuRF. SMuRF operates as an add-on to the community-developed bcbio-nextgen somatic variant calling pipeline. Supplementary information: Supplementary data are available at Bioinformatics online. |
Database: | OpenAIRE |
External link: |
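
The description says that the method combines the calls of several somatic mutation callers with auxiliary features in a supervised random forest. The following is a minimal sketch of that general idea only, written in plain Python rather than R, and it is not SMuRF's code. The feature names (`caller_a`, `caller_b`, `caller_c`, `vaf`, `depth`), the toy training rows, and the hyperparameters are all assumptions made for illustration.

```python
import random

# Hypothetical feature layout per candidate variant (illustrative only):
# three caller votes (0/1), variant allele frequency, and read depth.
FEATURES = ["caller_a", "caller_b", "caller_c", "vaf", "depth"]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, feature_ids):
    """Return (impurity, feature, threshold) of the best split, or None."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, rng, depth=0, max_depth=3, n_try=2):
    """Grow a small tree; each node considers n_try randomly chosen features."""
    if depth == max_depth or len(set(labels)) == 1:
        return sum(labels) / len(labels)  # leaf: fraction of true variants
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return sum(labels) / len(labels)
    _, f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], rng, depth + 1, max_depth, n_try),
            build_tree([r for r, _ in hi], [y for _, y in hi], rng, depth + 1, max_depth, n_try))

def tree_predict(node, row):
    """Walk from the root to a leaf and return the leaf's score."""
    while isinstance(node, tuple):
        f, t, lo, hi = node
        node = lo if row[f] <= t else hi
    return node

def train_forest(rows, labels, n_trees=50, seed=0):
    """Train each tree on a bootstrap sample of the labeled variants."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def forest_score(forest, row):
    """Average the per-tree scores into one confidence value in [0, 1]."""
    return sum(tree_predict(tree, row) for tree in forest) / len(forest)

# Toy, made-up training data: 1 = true somatic variant, 0 = false positive.
ROWS = [
    [1, 1, 1, 0.35, 60], [1, 1, 0, 0.22, 45], [1, 0, 1, 0.12, 80],
    [0, 1, 1, 0.08, 30], [1, 1, 1, 0.05, 100], [1, 1, 0, 0.04, 55],
    [0, 0, 0, 0.01, 70], [1, 0, 0, 0.02, 40], [0, 1, 0, 0.01, 90],
    [0, 0, 1, 0.03, 35], [0, 0, 0, 0.02, 50], [1, 0, 0, 0.01, 65],
]
LABELS = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

forest = train_forest(ROWS, LABELS)
score_pos = forest_score(forest, [1, 1, 1, 0.40, 60])  # all callers agree
score_neg = forest_score(forest, [0, 0, 0, 0.00, 60])  # no caller support
print(f"supported candidate: {score_pos:.2f}, unsupported candidate: {score_neg:.2f}")
```

Averaging many shallow trees, each grown on a bootstrap sample with a random subset of features per node, is what lets the ensemble weigh individual callers against auxiliary evidence such as allele frequency instead of relying on a fixed voting rule. A real workflow would read caller output from VCF files and train on curated truth sets, which this sketch does not attempt.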