Structural variant-based pangenome construction has low sensitivity to variability of haplotype-resolved bovine assemblies
Author: | Timothy P. L. Smith, Hubert Pausch, C. Herrera, Michael P. Heaton, Derek M. Bickhart, Kristen L. Kuhn, Benjamin D. Rosen, Brian L. Vander Ley, Danang Crysnanto, Alexander S. Leonard, Zih-Hua Fang, Heinrich Bollwein |
---|---|
Contributors: | University of Zurich; Leonard, Alexander S; Rosen, Benjamin D; Pausch, Hubert |
Year of publication: | 2022 |
Subject: | 1000 Multidisciplinary; 1300 General Biochemistry, Genetics and Molecular Biology; 1600 General Chemistry; 3100 General Physics and Astronomy; 570 Life sciences, biology; 10187 Department of Farm Animals; Computational biology; Biology; Contiguity; Phenotype; Loss of heterozygosity; Structural variation; Centromere; PRDM9; Sequence (medicine); Reference genome |
Source: | Nature Communications, 13 (1) |
ISSN: | 2041-1723 |
Description: | Advantages of pangenomes over linear reference assemblies for genome research have recently been established. However, potential effects of sequence platform and assembly approach, or of combining assemblies created by different approaches, on pangenome construction have not been investigated. Here we generate haplotype-resolved assemblies from the offspring of three bovine trios representing increasing levels of heterozygosity that each demonstrate a substantial improvement in contiguity, completeness, and accuracy over the current Bos taurus reference genome. Diploid coverage as low as 20x for HiFi or 60x for ONT is sufficient to produce two haplotype-resolved assemblies meeting standards set by the Vertebrate Genomes Project. Structural variant-based pangenomes created from the haplotype-resolved assemblies demonstrate significant consensus regardless of sequence platform, assembler algorithm, or coverage. Inspecting pangenome topologies identifies 90 thousand structural variants including 931 overlapping with coding sequences; this approach reveals variants affecting QRICH2, PRDM9, HSPA1A, TAS2R46, and GC that have potential to affect phenotype. |
Database: | OpenAIRE |