Towards a comprehensive regulatory map of Mammalian Genomes.
Authors: | Gonçalves TM; Washington University School of Medicine., Stewart CL; University of Georgia., Baxley SD; University of Georgia., Xu J; Missouri University of Science & Technology., Li D; Washington University School of Medicine., Gabel HW; Washington University School of Medicine., Wang T; Washington University School of Medicine., Avraham O; University of Georgia., Zhao G; Washington University School of Medicine. |
---|---|
Language: | English |
Source: | Research square [Res Sq] 2023 Sep 28. Date of Electronic Publication: 2023 Sep 28. |
DOI: | 10.21203/rs.3.rs-3294408/v1 |
Abstract: | Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We developed a training-data-free CRM-prediction algorithm, the Mammalian Regulatory MOdule Detector (MrMOD), for accurate CRM prediction in mammalian genomes. MrMOD provides genome-position-fixed CRM models, similar to the fixed gene models, for the mouse and human genomes, using only genomic sequences as input and one adjustable parameter: the significance p-value. Importantly, MrMOD predicts a comprehensive set of high-resolution CRMs in the mouse and human genomes, including all types of regulatory modules, not limited to any tissue, cell type, developmental stage, or condition. We computationally validated MrMOD predictions using a compendium of 21 orthogonal experimental data sets, including thousands of experimentally defined CRMs and millions of putative regulatory elements derived from hundreds of different tissues, cell types, and stimulus conditions obtained from multiple databases. An in ovo transgenic reporter assay demonstrates the power of our predictions in guiding experimental design. We analyzed CRMs located on chromosome 17 using unsupervised machine learning and identified groups of CRMs with multiple lines of evidence supporting their functionality, linking CRMs with upstream binding transcription factors and downstream target genes. Our work provides a comprehensive, base-pair-resolution annotation of the functional regulatory elements and non-functional regions in mammalian genomes. Competing Interests: The authors have no conflicts of interest or financial ties to disclose. |
Database: | MEDLINE |