Construction and evaluation of a new rat reference genome assembly, GRCr8, from long reads and long-range scaffolding.

Authors: Li K; Gluck Equine Genomics Center, University of Kentucky, Lexington, Kentucky 40503, USA., Smith ML; Department of Biochemistry and Molecular Biology, University of Louisville School of Medicine, Louisville, Kentucky 40202, USA., Blazier JC; Texas A&M Institute for Genome Sciences and Society, Texas A&M University, College Station, Texas 77843, USA., Kochan KJ; Texas A&M Institute for Genome Sciences and Society, Texas A&M University, College Station, Texas 77843, USA., Wood JMD; Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, United Kingdom., Howe K; Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, United Kingdom., Kwitek AE; Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA., Dwinell MR; Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA., Chen H; Department of Pharmacology, University of Tennessee Health Sciences Center, Memphis, Tennessee 38163, USA., Ciosek JL; Gluck Equine Genomics Center, University of Kentucky, Lexington, Kentucky 40503, USA., Masterson P; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA., Murphy TD; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA., Kalbfleisch TS; Gluck Equine Genomics Center, University of Kentucky, Lexington, Kentucky 40503, USA., Doris PA; Center for Human Genetics, Brown Foundation Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center, Houston, Texas 77030, USA peter.a.doris@uth.tmc.edu.
Language: English
Source: Genome research [Genome Res] 2024 Nov 20; Vol. 34 (11), pp. 2081-2093. Date of Electronic Publication: 2024 Nov 20.
DOI: 10.1101/gr.279292.124
Abstract: We report the construction and analysis of a new reference genome assembly for Rattus norvegicus, the laboratory rat, a widely used experimental animal model organism. The assembly has been adopted as the rat reference assembly by the Genome Reference Consortium and is named GRCr8. The assembly employed 40× Pacific Biosciences (PacBio) HiFi sequencing coverage and scaffolding using optical mapping and Hi-C. We used genomic DNA from a male BN/NHsdMcwi (BN) rat of the same strain and from the same colony as the prior reference assembly, mRatBN7.2. The assembly is at chromosome level with 98.7% of the sequence assigned to chromosomes. All chromosomes have increased in size compared with the prior assembly, and k-mer analysis indicates that the subject animal is fully inbred and that the genome is represented as a single haploid assembly. Notable increases are observed in Chromosomes 3, 11, and 12 in the prospective rDNA regions. In addition, Chr Y has increased threefold in size and is more consistent with the rat karyotype than previous assemblies. Several other chromosomes have grown by the incorporation of sizable discrete new blocks. These contain highly repetitive sequences and encode numerous previously unannotated genes. In addition, centromeric sequences are incorporated in most chromosomes. Genome annotation has been performed by NCBI RefSeq, which confirms improvement in assembly quality and adds more than 1100 new protein coding genes. PacBio Iso-Seq data have been acquired from multiple tissues of the subject animal and are released concurrently with the new assembly to aid further analyses.
(© 2024 Li et al.; Published by Cold Spring Harbor Laboratory Press.)
Database: MEDLINE