Integrated variant allele frequency analysis pipeline and R package: easyVAF.

Authors: Hu J; Biostatistics Shared Resource (RRID: SCR_021981), University of Colorado Cancer Center, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.; Department of Pediatrics, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA., Alami V; Biostatistics Shared Resource (RRID: SCR_021981), University of Colorado Cancer Center, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA., Zhuang Y; Biostatistics Shared Resource (RRID: SCR_021981), University of Colorado Cancer Center, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.; Department of Pediatrics, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA., Alzofon N; Division of Medical Oncology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA., Jimeno A; Division of Medical Oncology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA., Gao D; Biostatistics Shared Resource (RRID: SCR_021981), University of Colorado Cancer Center, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.; Department of Pediatrics, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA.
Language: English
Source: Molecular carcinogenesis [Mol Carcinog] 2023 Dec; Vol. 62 (12), pp. 1877-1887. Date of Electronic Publication: 2023 Aug 22.
DOI: 10.1002/mc.23621
Abstract: Somatic sequence variants are associated with cancer diagnosis, prognostic stratification, and treatment response. Variant allele frequency (VAF), the percentage of sequence reads with a specific DNA variant over the read depth at that locus, has been used as a metric to quantify mutation rates in these applications. VAF has the potential for feature detection by reflecting changes in tumor clonal composition across treatments or time points. Although there are several packages, including Genome Analysis Toolkit and VarScan, designed for variant calling and rare mutation identification, there is no readily available package for comparing VAFs among and between groups to identify loci of interest. To this end, we have developed the R package easyVAF, which includes parametric and nonparametric tests to compare VAFs among multiple groups. It is accompanied by an interactive R Shiny app. With easyVAF, the investigator can choose among three statistical tests to maximize power while maintaining an acceptable type I error rate. This paper presents our proposed pipeline for VAF analysis, from quality checking to group comparison. We evaluate our method in a wide range of simulated scenarios and show that choosing the appropriate test to limit the type I error rate is critical. For situations where data are sparse, we recommend comparing VAFs with the beta-binomial likelihood ratio test over Fisher's exact test and Pearson's χ² test.
(© 2023 Wiley Periodicals LLC.)
Database: MEDLINE
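
Illustrative sketch: the abstract defines VAF as the percentage of reads carrying a variant over the read depth at a locus, and names Fisher's exact test as one of the tests used to compare VAFs between groups. The Python snippet below is a minimal sketch of those two ideas only. It is not the easyVAF API (easyVAF is an R package); the function names `vaf` and `fisher_exact_two_sided` and the read counts are hypothetical, chosen for illustration. It does not implement the beta-binomial likelihood ratio test that the abstract recommends for sparse data.

```python
from math import comb


def vaf(variant_reads: int, read_depth: int) -> float:
    """Variant allele frequency as a percentage of the read depth."""
    if read_depth <= 0:
        raise ValueError("read depth must be positive")
    return 100.0 * variant_reads / read_depth


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are (variant reads, reference reads).
    The p-value sums the hypergeometric probabilities of every table
    with the same margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(x: int) -> int:
        # Unnormalised hypergeometric probability, kept as an exact integer.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(total, col1)


# Made-up counts at one locus: group A has 8 variant reads out of 10,
# group B has 1 variant read out of 6.
vaf_a = vaf(8, 10)
vaf_b = vaf(1, 6)
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"VAF A = {vaf_a:.1f}%, VAF B = {vaf_b:.1f}%, p = {p_value:.4f}")
```

Pooling reads into a single 2x2 table treats every read as independent and ignores sample-to-sample variability within a group, which is one plausible reason a beta-binomial model may be preferred when several samples per group are available.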