A deep catalogue of protein-coding variation in 983,578 individuals.

Authors: Sun KY; Regeneron Genetics Center, Tarrytown, NY, USA., Bai X; Regeneron Genetics Center, Tarrytown, NY, USA., Chen S; Regeneron Genetics Center, Tarrytown, NY, USA., Bao S; Regeneron Genetics Center, Tarrytown, NY, USA., Zhang C; Regeneron Genetics Center, Tarrytown, NY, USA., Kapoor M; Regeneron Genetics Center, Tarrytown, NY, USA., Backman J; Regeneron Genetics Center, Tarrytown, NY, USA., Joseph T; Regeneron Genetics Center, Tarrytown, NY, USA., Maxwell E; Regeneron Genetics Center, Tarrytown, NY, USA., Mitra G; Regeneron Genetics Center, Tarrytown, NY, USA., Gorovits A; Regeneron Genetics Center, Tarrytown, NY, USA., Mansfield A; Regeneron Genetics Center, Tarrytown, NY, USA., Boutkov B; Regeneron Genetics Center, Tarrytown, NY, USA., Gokhale S; Regeneron Genetics Center, Tarrytown, NY, USA., Habegger L; Regeneron Genetics Center, Tarrytown, NY, USA., Marcketta A; Regeneron Genetics Center, Tarrytown, NY, USA., Locke AE; Regeneron Genetics Center, Tarrytown, NY, USA., Ganel L; Regeneron Genetics Center, Tarrytown, NY, USA., Hawes A; Regeneron Genetics Center, Tarrytown, NY, USA., Kessler MD; Regeneron Genetics Center, Tarrytown, NY, USA., Sharma D; Regeneron Genetics Center, Tarrytown, NY, USA., Staples J; Regeneron Genetics Center, Tarrytown, NY, USA., Bovijn J; Regeneron Genetics Center, Tarrytown, NY, USA., Gelfman S; Regeneron Genetics Center, Tarrytown, NY, USA., Di Gioia A; Regeneron Genetics Center, Tarrytown, NY, USA., Rajagopal VM; Regeneron Genetics Center, Tarrytown, NY, USA., Lopez A; Regeneron Genetics Center, Tarrytown, NY, USA., Varela JR; Regeneron Genetics Center, Tarrytown, NY, USA., Alegre-Díaz J; Faculty of Medicine, National Autonomous University of Mexico (UNAM), Mexico City, Mexico., Berumen J; Faculty of Medicine, National Autonomous University of Mexico (UNAM), Mexico City, Mexico., Tapia-Conyer R; Faculty of Medicine, National Autonomous University of Mexico (UNAM), Mexico City, Mexico., Kuri-Morales P; Faculty of Medicine, National Autonomous University of Mexico (UNAM), Mexico City, Mexico.; Instituto Tecnológico y de Estudios Superiores de Monterrey, Monterrey, Mexico., Torres J; Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK., Emberson J; Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK., Collins R; Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK., Cantor M; Regeneron Genetics Center, Tarrytown, NY, USA., Thornton T; Regeneron Genetics Center, Tarrytown, NY, USA., Kang HM; Regeneron Genetics Center, Tarrytown, NY, USA., Overton JD; Regeneron Genetics Center, Tarrytown, NY, USA., Shuldiner AR; Regeneron Genetics Center, Tarrytown, NY, USA., Cremona ML; Regeneron Genetics Center, Tarrytown, NY, USA., Nafde M; Regeneron Genetics Center, Tarrytown, NY, USA., Baras A; Regeneron Genetics Center, Tarrytown, NY, USA., Abecasis G; Regeneron Genetics Center, Tarrytown, NY, USA., Marchini J; Regeneron Genetics Center, Tarrytown, NY, USA., Reid JG; Regeneron Genetics Center, Tarrytown, NY, USA., Salerno W; Regeneron Genetics Center, Tarrytown, NY, USA. william.salerno@regeneron.com., Balasubramanian S; Regeneron Genetics Center, Tarrytown, NY, USA. suganthi.bala@regeneron.com.
Language: English
Source: Nature [Nature] 2024 Jul; Vol. 631 (8021), pp. 583-592. Date of Electronic Publication: 2024 May 20.
DOI: 10.1038/s41586-024-07556-0
Abstract: Rare coding variants that substantially affect function provide insights into the biology of a gene [1-3]. However, ascertaining the frequency of such variants requires large sample sizes [4-8]. Here we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. In total, 23% of the Regeneron Genetics Center Million Exome (RGC-ME) data come from individuals of African, East Asian, Indigenous American, Middle Eastern and South Asian ancestry. The catalogue includes more than 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss of function (LOF), we identify 3,988 LOF-intolerant genes, including 86 that were previously assessed as tolerant and 1,153 that lack established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions that are depleted of missense variants despite being tolerant of pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this resource of coding variation from the RGC-ME dataset publicly accessible through a variant allele frequency browser.
(© 2024. The Author(s).)
Database: MEDLINE