Prominent use of distal 5’ transcription start sites and discovery of a large number of additional exons in ENCODE regions
Author: | France Denoeud, Philipp Kapranov, Catherine Ucla, Adam Frankish, Robert Castelo, Jorg Drenkow, Julien Lagarde, Tyler Alioto, Caroline Manzano, Jacqueline Chrast, Sujit Dike, Carine Wyss, Charlotte N. Henrichsen, Nancy Holroyd, Mark C. Dickson, Ruth Taylor, Zahra Hance, Sylvain Foissac, Richard M. Myers, Jane Rogers, Tim Hubbard, Jennifer Harrow, Roderic Guigó, Thomas R. Gingeras, Stylianos E. Antonarakis, Alexandre Reymond |
---|---|
Contributors: | Independent researcher, Laboratoire de Génétique Cellulaire (LGC), Ecole Nationale Vétérinaire de Toulouse (ENVT), Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Institut National de la Recherche Agronomique (INRA), Institut National de la Recherche Agronomique (INRA)-Ecole Nationale Vétérinaire de Toulouse (ENVT), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées |
Language: | English |
Year of publication: | 2007 |
Subject: | DNA, Complementary; Transcription, Genetic; [SDV]Life Sciences [q-bio]; Quantitative Trait Loci; Locus (genetics); Biology; ENCODE; Genoma humà; Genome; Article; 03 medical and health sciences; Open Reading Frames; 0302 clinical medicine; Rapid amplification of cDNA ends; Human Genome Project; Genetics; Humans; ORFS; Promoter Regions, Genetic; Gene; Genetics (clinical); 030304 developmental biology; ddc:616; 0303 health sciences; Tiling array; Genome, Human; Chromosome Mapping; DNA, Complementary/genetics; Exons; 030220 oncology & carcinogenesis; Factors de transcripció; Human genome; Transcription, Genetic/physiology |
Source: | Genome Research, Vol. 17, No. 6 (2007), pp. 746-759; Genome Research, Cold Spring Harbor Laboratory Press, 2007, 17 (6), pp. 746-759. ⟨10.1101/gr.5660607⟩; Recercat. Dipósit de la Recerca de Catalunya |
ISSN: | 1088-9051 1549-5469 |
Description: | This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA Elements (ENCODE) pilot project, using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds, for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequence away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minor splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations. |
Database: | OpenAIRE |