Comparative validation of the D. melanogaster modENCODE transcriptome annotation.

Authors: Chen ZX; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Sturgill D; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Qu J; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Jiang H; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Park S; Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA;, Boley N; Department of Statistics, University of California, Berkeley, California 94720, USA;, Suzuki AM; Technology Development Group, RIKEN Omics Science Center and RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama City, Kanagawa, Japan 230-0045;, Fletcher AR; Division of Computational Bioscience, Center For Information Technology, National Institutes of Health, Bethesda, Maryland 20814, USA;, Plachetzki DC; Department of Evolution and Ecology, University of California, Davis, California 95616, USA;, FitzGerald PC; National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA;, Artieri CG; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Atallah J; Department of Evolution and Ecology, University of California, Davis, California 95616, USA;, Barmina O; Department of Evolution and Ecology, University of California, Davis, California 95616, USA;, Brown JB; Department of Statistics, University of California, Berkeley, California 94720, USA;, Blankenburg KP; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Clough E; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Dasgupta A; Clinical Trials and Outcomes Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Gubbala S; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Han Y; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Jayaseelan JC; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Kalra D; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Kim YA; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20892, USA;, Kovar CL; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Lee SL; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Li M; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Malley JD; Division of Computational Bioscience, Center For Information Technology, National Institutes of Health, Bethesda, Maryland 20814, USA;, Malone JH; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Mathew T; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Mattiuzzo NR; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Munidasa M; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Muzny DM; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Ongeri F; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Perales L; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Przytycka TM; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20892, USA;, Pu LL; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Robinson G; Department of Statistics, University of California, Berkeley, California 94720, USA;, Thornton RL; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Saada N; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Scherer SE; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Smith HE; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Vinson C; National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA;, Warner CB; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Worley KC; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Wu YQ; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Zou X; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Cherbas P; Department of Biology, Indiana University, Bloomington, Indiana 47405, USA;, Kellis M; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA;, Eisen MB; Molecular and Cell Biology, University of California, Berkeley, California 94720, USA;, Piano F; Department of Biology, New York University, New York, New York 10003, USA;, Kiontke K; Department of Biology, New York University, New York, New York 10003, USA;, Fitch DH; Department of Biology, New York University, New York, New York 10003, USA;, Sternberg PW; HHMI and Division of Biology, California Institute of Technology, Pasadena, California 91125, USA;, Cutter AD; Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, M5S 3B2, Canada;, Duff MO; Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, Connecticut 06030-6403, USA., Hoskins RA; Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA;, Graveley BR; Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, Connecticut 06030-6403, USA., Gibbs RA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;, Bickel PJ; Department of Statistics, University of California, Berkeley, California 94720, USA;, Kopp A; Department of Evolution and Ecology, University of California, Davis, California 95616, USA;, Carninci P; Technology Development Group, RIKEN Omics Science Center and RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama City, Kanagawa, Japan 230-0045;, Celniker SE; Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA;, Oliver B; National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA;, Richards S; Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;
Language: English
Source: Genome Research [Genome Res] 2014 Jul; Vol. 24 (7), pp. 1209-23.
DOI: 10.1101/gr.159384.113
Abstract: Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.
(© 2014 Chen et al.; Published by Cold Spring Harbor Laboratory Press.)
Database: MEDLINE