Structural variation across 138,134 samples in the TOPMed consortium

Authors: Goo Jun, Adam English, Ginger Metcalf, Jianzhi Yang, Mark Chaisson, Nathan Pankratz, Vipin Menon, William Salerno, Olga Krasheninina, Albert Smith, John Lane, Thomas Blackwell, Hyun Min Kang, Sejal Salvi, Qingchang Meng, Hua Shen, Divya Pasham, Sravya Bhamidipati, Kavya Kottapalli, Donna Arnett, Allison Ashley-Koch, Paul Auer, Kathleen Beutel, Joshua Bis, John Blangero, Donald Bowden, Jennifer Brody, Brian Cade, Yii-Der Ida Chen, Michael Cho, Joanne Curran, Myriam Fornage, Barry Freedman, Tasha Fingerlin, Bruce Gelb, Lifang Hou, Yi-Jen Hung, John P Kane, Robert Kaplan, Wonji Kim, Ruth Loos, Gregory Marcus, Rasika Mathias, Stephen McGarvey, Courtney Montgomery, Take Naseri, Seyed Nouraie, Michael Preuss, Nicholette Palmer, Patricia Peyser, Laura Raffield, Aakrosh Ratan, Susan Redline, Muagututia Reupena, Jerome Rotter, Stephen Rich, Michiel Rienstra, Ingo Ruczinski, Vijay Sankaran, David Schwartz, Christine Seidman, Jonathan Seidman, Edwin Silverman, Jennifer Smith, Adrienne Stilp, Kent Taylor, Marilyn Telen, Scott Weiss, L. Keoki Williams, Baojun Wu, Lisa Yanek, Yingze Zhang, Jessica Lasky-Su, Marie-Claude Gingras, Susan Dutcher, Evan Eichler, Stacey Gabriel, Soren Germer, Ryan Kim, Karine Martinez, Deborah Nickerson, James Luo, Alexander Reiner, Richard Gibbs, Eric Boerwinkle, Goncalo Abecasis, Fritz Sedlazeck
Publication year: 2023
Subject:
Source: bioRxiv
Description: Ever-larger structural variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. Identifying SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) of 50 bp or longer across the autosomes and the X chromosome from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference, which yield high variant quality and >90% allele concordance compared to long-read de novo assemblies of well-characterized control samples. We demonstrate the catalog's utility through significant associations between SVs and various important cardio-metabolic and hematologic traits. We identified 690 SV hotspots and deserts, as well as those that potentially impact the regulation of medically relevant genes. This catalog characterizes SVs across multiple populations and will serve as a valuable tool for understanding the impact of SVs on disease development and progression.
Database: OpenAIRE