The case for using mapped exonic non-duplicate reads when reporting RNA-sequencing depth: examples from pediatric cancer datasets.
Author: | Beale HC; UC Santa Cruz, Molecular, Cell and Developmental Biology, 1156 High Street, Santa Cruz, CA 95064, USA.; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Roger JM; UC Santa Cruz, School of Engineering, 1156 High Street, Santa Cruz, CA 95064, USA., Cattle MA; UC Santa Cruz, School of Engineering, 1156 High Street, Santa Cruz, CA 95064, USA., McKay LT; UC Santa Cruz, School of Engineering, 1156 High Street, Santa Cruz, CA 95064, USA., Thompson DKA; UC Santa Cruz, School of Engineering, 1156 High Street, Santa Cruz, CA 95064, USA., Learned K; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Lyle AG; UC Santa Cruz, Molecular, Cell and Developmental Biology, 1156 High Street, Santa Cruz, CA 95064, USA.; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Kephart ET; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Currie R; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Lam DL; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Sanders L; UC Santa Cruz, Molecular, Cell and Developmental Biology, 1156 High Street, Santa Cruz, CA 95064, USA., Pfeil J; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Vivian J; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Bjork I; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Salama SR; UC Santa Cruz, Department of Biomolecular Engineering, 1156 High Street, Santa Cruz, CA 95064, USA.; Howard Hughes Medical Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Haussler D; UC Santa Cruz, Department of Biomolecular Engineering, 1156 High Street, Santa Cruz, CA 95064, USA.; Howard Hughes Medical Institute, 1156 High Street, Santa Cruz, CA 95064, USA., Vaske OM; UC Santa Cruz, Molecular, Cell and Developmental Biology, 1156 High Street, Santa Cruz, CA 95064, USA.; UC Santa Cruz, Genomics Institute, 1156 High Street, Santa Cruz, CA 95064, USA. |
---|---|
Language: | English |
Source: | GigaScience [Gigascience] 2021 Mar 13; Vol. 10 (3). |
DOI: | 10.1093/gigascience/giab011 |
Abstract: | Background: The reproducibility of gene expression measured by RNA sequencing (RNA-Seq) is dependent on the sequencing depth. While unmapped or non-exonic reads do not contribute to gene expression quantification, duplicate reads contribute to the quantification but are not informative for reproducibility. We show that mapped, exonic, non-duplicate (MEND) reads are a useful measure of reproducibility of RNA-Seq datasets used for gene expression analysis. Findings: In bulk RNA-Seq datasets from 2,179 tumors in 48 cohorts, the fraction of reads that contribute to the reproducibility of gene expression analysis varies greatly. Unmapped reads constitute 1-77% of all reads (median [IQR], 3% [3-6%]); duplicate reads constitute 3-100% of mapped reads (median [IQR], 27% [13-43%]); and non-exonic reads constitute 4-97% of mapped, non-duplicate reads (median [IQR], 25% [16-37%]). MEND reads constitute 0-79% of total reads (median [IQR], 50% [30-61%]). Conclusions: Because not all reads in an RNA-Seq dataset are informative for reproducibility of gene expression measurements and the fraction of reads that are informative varies, we propose reporting a dataset's sequencing depth in MEND reads, which definitively inform the reproducibility of gene expression, rather than total, mapped, or exonic reads. We provide a Docker image containing (i) the existing required tools (RSeQC, sambamba, and samblaster) and (ii) a custom script to calculate MEND reads from RNA-Seq data files. We recommend that all RNA-Seq gene expression experiments, sensitivity studies, and depth recommendations use MEND units for sequencing depth. (© The Author(s) 2021. Published by Oxford University Press GigaScience.) |
Database: | MEDLINE |
External link: |
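
The abstract reports each excluded category against a different denominator: unmapped reads as a share of all reads, duplicates as a share of mapped reads, and non-exonic reads as a share of mapped, non-duplicate reads. The sketch below shows how three such fractions would chain into a MEND fraction of total reads. It is illustrative arithmetic only, not the authors' script; the function names `mend_fraction` and `mend_reads` and the 100-million-read library are hypothetical. The example inputs are the reported medians (3%, 27%, 25%); because medians of separate distributions need not combine exactly, the result (about 53%) only approximates the reported median MEND fraction of 50%.

```python
def mend_fraction(unmapped_of_total, duplicate_of_mapped, nonexonic_of_mapped_nondup):
    """Return the fraction of total reads that are mapped, exonic, and non-duplicate.

    Each argument is a fraction in [0, 1] measured against the denominator
    used in the abstract:
      unmapped_of_total          -- unmapped reads / all reads
      duplicate_of_mapped        -- duplicate reads / mapped reads
      nonexonic_of_mapped_nondup -- non-exonic reads / mapped non-duplicate reads
    """
    fractions = {
        "unmapped_of_total": unmapped_of_total,
        "duplicate_of_mapped": duplicate_of_mapped,
        "nonexonic_of_mapped_nondup": nonexonic_of_mapped_nondup,
    }
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    mapped = 1.0 - unmapped_of_total                      # share of all reads that map
    mapped_nondup = mapped * (1.0 - duplicate_of_mapped)  # ...and are not duplicates
    return mapped_nondup * (1.0 - nonexonic_of_mapped_nondup)  # ...and are exonic


def mend_reads(total_reads, unmapped_of_total, duplicate_of_mapped, nonexonic_of_mapped_nondup):
    """Estimate the MEND read count for a library of total_reads reads."""
    fraction = mend_fraction(unmapped_of_total, duplicate_of_mapped, nonexonic_of_mapped_nondup)
    return round(total_reads * fraction)


if __name__ == "__main__":
    # Reported medians from the abstract, applied to a hypothetical library size.
    fraction = mend_fraction(0.03, 0.27, 0.25)
    print(f"MEND fraction of total reads: {fraction:.3f}")
    print(f"MEND reads in 100 million total: {mend_reads(100_000_000, 0.03, 0.27, 0.25):,}")
```

Under these assumptions, two libraries with the same total read count can differ widely in MEND depth, which is the abstract's argument for reporting depth in MEND reads rather than total reads.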