Detection and Visualization of Compositionally Similar cis-Regulatory Element Clusters in Orthologous and Coordinately Controlled Genes

Authors: Anil G. Jegga, Jerry L. Phillips, Andrew T. Pinski, Bruce J. Aronow, James W. Carman, Shawn P. Sherwood, John Pestian
Year of publication: 2002
Subject:
Source: Genome Research. 12:1408-1417
ISSN: 1549-5469, 1088-9051
DOI: 10.1101/gr.255002
Description: Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clusters within nonorthologous genes, promoters, and enhancers that exhibit coordinate regulatory properties. Known functional regulatory regions of nonorthologous and less-conserved orthologous genes frequently showed cis-element shuffling, demonstrating that compositional similarity can be more sensitive than sequence similarity. These results show that combining sequence similarity with cis-element compositional similarity provides a powerful aid for the identification of potential control regions.
Database: OpenAIRE
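
Note: the description defines a hit as the density of co-occurring TF-binding sites within a 200-bp moving window. The following is a minimal illustrative sketch of that windowed count, not the authors' implementation. It assumes the positions of shared binding sites have already been determined (for example, from BLASTZ and MatInspector output); the function name `window_hit_counts`, its parameters, and the half-open window convention are hypothetical choices made for this example.

```python
from bisect import bisect_left


def window_hit_counts(site_positions, region_length, window=200, step=1):
    """Count binding sites inside a window centred on each sampled position.

    site_positions: 0-based coordinates of shared TF-binding sites
                    (assumed to be precomputed; hypothetical input format).
    region_length:  length of the conserved region in bp.
    window:         window width in bp (200 bp, as in the description).
    step:           spacing between sampled window centres (an assumption).
    """
    positions = sorted(site_positions)
    half = window // 2
    counts = []
    for center in range(0, region_length, step):
        # Half-open interval [center - half, center + half).
        lo = bisect_left(positions, center - half)
        hi = bisect_left(positions, center + half)
        counts.append(hi - lo)
    return counts


# Toy data: four clustered sites and one isolated site in a 1000-bp region.
shared_sites = [120, 150, 180, 210, 900]
profile = window_hit_counts(shared_sites, region_length=1000, step=100)
print(profile)  # [0, 3, 4, 1, 0, 0, 0, 0, 0, 1]
```

Plotting such a profile against position for each of the two aligned sequences would give a curve in the spirit of the Regulogram described above; the peak near position 200 corresponds to the cluster of four sites.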