Authors: |
MITROFANOV, SERGEY I., PANCHIN, ALEXANDER Y., SPIRIN, SERGEI A., ALEXEEVSKI, ANDREI V., PANCHIN, YURI V. |
Subject: |
|
Source: |
Journal of Bioinformatics & Computational Biology; Jun 2010, Vol. 8 Issue 3, p519-534, 16p, 4 Charts |
Abstract: |
We studied the distribution of 1–7 bp words in a dataset that includes 139 complete eukaryotic genomes, 33 masked eukaryotic genomes and coding regions from 35 genomes. We tested different statistical models to determine over- and under-represented words. The method described by Karlin et al. has the strongest predictive power compared to other methods. Using this method we identified over- and under-represented words consistent within a large array of taxonomic groups. Some of those words have not yet been described as exclusive. For example, CGCG is over-represented in CG-deficient organisms. We also describe exceptions for widely known exclusive words, such as CG and TA. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |
External link: |
|
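Note: The abstract names the method of Karlin et al. but does not state its formula. As a rough illustration only, the sketch below computes the dinucleotide odds ratio usually associated with that line of work, rho_xy = f_xy / (f_x * f_y), where values well below 1 suggest an under-represented word and values well above 1 an over-represented one. Whether the paper applies exactly this form, and how it extends it to words of 3 to 7 bp, is an assumption here; the function name `dinucleotide_odds_ratio` is hypothetical and does not come from the record.

```python
from collections import Counter

BASES = set("ACGT")


def dinucleotide_odds_ratio(seq: str) -> dict[str, float]:
    """Return rho_xy = f_xy / (f_x * f_y) for each dinucleotide seen in seq.

    Illustrative sketch only: single strand, no reverse-complement
    symmetrization, and any position that is not A, C, G or T
    (for example N in a masked genome) is ignored.
    """
    seq = seq.upper()

    # Mononucleotide counts over unambiguous bases only.
    mono = Counter(base for base in seq if base in BASES)
    n_mono = sum(mono.values())

    # Overlapping dinucleotide counts, skipping pairs that touch a masked base.
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_di = sum(di.values())

    ratios = {}
    for pair, count in di.items():
        f_xy = count / n_di            # observed dinucleotide frequency
        f_x = mono[pair[0]] / n_mono   # frequency of the first base
        f_y = mono[pair[1]] / n_mono   # frequency of the second base
        ratios[pair] = f_xy / (f_x * f_y)
    return ratios
```

Called on a genome-scale sequence, the returned dictionary can be sorted by value to list candidate under- and over-represented dinucleotides; on short strings the ratios are too noisy to interpret.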