Assessing reproducibility of inherited variants detected with short-read whole genome sequencing.

Authors: Pan B; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Ren L; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.; Human Phenome Institute, Fudan University, Shanghai, 200438, China., Onuchic V; Illumina Inc., San Diego, CA, 92122, USA., Guan M; SAS Institute Inc., Cary, NC, 27513, USA., Kusko R; Immuneering Corporation, Cambridge, MA, 02142, USA., Bruinsma S; Illumina Inc., San Diego, CA, 92122, USA., Trigg L; Real Time Genomics, Hamilton, New Zealand., Scherer A; Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.; EATRIS ERIC - European Infrastructure for Translational Medicine, Amsterdam, the Netherlands., Ning B; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Zhang C; School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA., Glidewell-Kenney C; Illumina Inc., San Diego, CA, 92122, USA., Xiao C; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA., Donaldson E; Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, 20993, USA., Sedlazeck FJ; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA., Schroth G; Illumina Inc., San Diego, CA, 92122, USA., Yavas G; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Grunenwald H; Illumina Inc., San Diego, CA, 92122, USA., Chen H; Sentieon Inc., San Jose, CA, 95134, USA., Meinholz H; Illumina Inc., San Diego, CA, 92122, USA., Meehan J; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Wang J; Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China., Yang J; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.; Human Phenome Institute, Fudan University, Shanghai, 200438, China., Foox J; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA., Shang J; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.; Human Phenome Institute, Fudan University, Shanghai, 200438, China., Miclaus K; SAS Institute Inc., Cary, NC, 27513, USA., Dong L; Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China., Shi L; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.; Human Phenome Institute, Fudan University, Shanghai, 200438, China., Mohiyuddin M; Roche Sequencing Solutions, Santa Clara, CA, 95050, USA., Pirooznia M; Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA., Gong P; Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA., Golshani R; Illumina Inc., San Diego, CA, 92122, USA., Wolfinger R; SAS Institute Inc., Cary, NC, 27513, USA., Lababidi S; Office of Health Informatics, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, 20993, USA., Sahraeian SME; Roche Sequencing Solutions, Santa Clara, CA, 95050, USA., Sherry S; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA., Han T; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Chen T; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Shi T; The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China., Hou W; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.; Human Phenome Institute, Fudan University, Shanghai, 200438, China., Ge W; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Zou W; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Guo W; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Bao W; SAS Institute Inc., Cary, NC, 27513, USA., Xiao W; Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA, 94305, USA., Fan X; Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China., Gondo Y; Department of Molecular Life Sciences, Tokai University School of Medicine, 143 Shimokasuya, Isehara, 259-1193, Japan., Yu Y; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.; Human Phenome Institute, Fudan University, Shanghai, 200438, China., Zhao Y; CCR-SF Bioinformatics Group, Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science, Frederick National Laboratory for Cancer Research, Frederick, MD, 21701, USA., Su Z; Takeda Pharmaceuticals, Cambridge, MA, 02139, USA., Liu Z; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Tong W; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA., Xiao W; Division of Molecular Genetics and Pathology, Center for Device and Radiological Health, US Food and Drug Administration, Silver Spring, MD, 20993, USA., Zook JM; Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA. justin.zook@nist.gov., Zheng Y; State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China. zhengyuanting@fudan.edu.cn.; Human Phenome Institute, Fudan University, Shanghai, 200438, China. zhengyuanting@fudan.edu.cn., Hong H; Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA. huixiao.hong@fda.hhs.gov.
Language: English
Source: Genome Biology [Genome Biol] 2022 Jan 03; Vol. 23 (1), pp. 2. Date of Electronic Publication: 2022 Jan 03.
DOI: 10.1186/s13059-021-02569-8
Abstract: Background: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS.
Results: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×.
Conclusions: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.
(© 2021. The Author(s).)
Database: MEDLINE