Drosophila 3' UTRs are more complex than protein-coding sequences
Authors: | Kerrie Mengersen, Jonathan M. Keith, Edward Tasker, Manjula Algama, Christopher Oldmeadow |
---|---|
Year of publication: | 2013 |
Subject: |
Untranslated region; 010000 MATHEMATICAL SCIENCES; 01 natural sciences; 010104 statistics & probability; Melanogaster; Transversion; 3' Untranslated Regions; Genome Evolution; Genetics; 0303 health sciences; Multidisciplinary; Genomics; Functional Genomics; Regulatory sequence; Physical Sciences; Medicine; Drosophila; Sequence databases; Drosophila melanogaster; Multiple alignment calculation; Untranslated regions; Sequence Analysis; Statistics (Mathematics); Research Article; Genome evolution; Markov Models; Science; Molecular Sequence Data; Sequence alignment; Computational biology; Biology; Biostatistics; Genome Complexity; 03 medical and health sciences; Open Reading Frames; Species Specificity; Animals; 0101 mathematics; Statistical Methods; Molecular Biology Techniques; Sequencing Techniques; Molecular Biology; 060100 BIOCHEMISTRY AND CELL BIOLOGY; 030304 developmental biology; Base Sequence; Models, Genetic; Computational Biology; Genetic Variation; Biology and Life Sciences; Bayes Theorem; Comparative Genomics; biology.organism_classification; Genome Analysis; Probability Theory; Sequence motif analysis; GC-content; Mathematics |
Source: | PLoS ONE, Vol 9, Iss 5, p e97336 (2014) |
ISSN: | 1932-6203 |
Description: | The 3' UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3' UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3' UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3' UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3' UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request. |
Database: | OpenAIRE |
External link: |
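
The description mentions encoding pairwise alignments so that each column carries information about conservation, GC content and transition/transversion status. The record does not give the authors' actual encoding, so the following is only a minimal illustrative sketch under assumed conventions: the five-symbol alphabet (`S`, `W`, `i`, `v`, `g`) and the function names `encode_column` and `encode_alignment` are hypothetical and are not taken from the paper or its C code.

```python
# Hypothetical encoding of a pairwise alignment into a symbol string.
# The alphabet below is an assumption made for illustration only.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def encode_column(a: str, b: str) -> str:
    """Encode one alignment column (one character from each species)."""
    a, b = a.upper(), b.upper()
    if a == "-" or b == "-":
        return "g"  # gap in either sequence
    if a not in BASES or b not in BASES:
        return "n"  # ambiguous or unknown base
    if a == b:
        # Conserved column: distinguish strong (G/C) from weak (A/T) bases.
        return "S" if a in {"G", "C"} else "W"
    # Mismatch: purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "i" if same_class else "v"  # i = transition, v = transversion


def encode_alignment(seq1: str, seq2: str) -> str:
    """Encode two aligned sequences of equal length, column by column."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return "".join(encode_column(a, b) for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    print(encode_alignment("ACGTA-", "ACATCG"))
```

A symbol string of this kind could then be passed to a segmentation method, which would look for runs of columns with similar symbol frequencies. A three-way alignment would need a larger alphabet, which this sketch does not attempt.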