Protein languages differ depending on microorganism lifestyle.

Authors: Grzymski JJ; Division of Earth and Ecosystem Sciences, Desert Research Institute, Reno, Nevada, United States of America. Marsh AG; Center for Bioinformatics and Computational Biology, Marine Biological Sciences, University of Delaware, Lewes, Delaware, United States of America.
Language: English
Source: PloS one [PLoS One] 2014 May 14; Vol. 9 (5), pp. e96910. Date of Electronic Publication: 2014 May 14 (Print Publication: 2014).
DOI: 10.1371/journal.pone.0096910
Abstract: Few quantitative measures of genome architecture or organization exist to support assumptions of differences between microorganisms that are broadly defined as being free-living or pathogenic. General principles about complete proteomes exist for codon usage, amino acid biases and essential or core genes. Genome-wide shifts in amino acid usage between free-living and pathogenic microorganisms result in fundamental differences in the complexity of their respective proteomes that are size and gene content independent. These differences are evident across broad phylogenetic groups, a result of environmental factors and population genetic forces rather than phylogenetic distance. A novel comparative analysis of amino acid usage, utilizing linguistic analyses of word frequency in language and text, identified a global pattern of higher peptide word repetition in 376 free-living versus 421 pathogen genomes across broad ranges of genome size, G+C content and phylogenetic ancestry. This imprint of repetitive word usage indicates free-living microorganisms have a bias for repetitive sequence usage compared to pathogens. These findings quantify fundamental differences in microbial genomes relative to life-history function.
Database: MEDLINE
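
Note: The abstract describes treating short peptide stretches as "words" and comparing how often they repeat across proteomes. The record does not give the word length or the statistic used, so the following is only a minimal sketch of the general idea, not the authors' method. The word length `k=3`, the function names `peptide_words` and `repetition_fraction`, and the choice of "fraction of word occurrences belonging to words seen more than once" as the repetition measure are all assumptions made for illustration.

```python
from collections import Counter


def peptide_words(protein, k=3):
    """Return all overlapping k-letter words in one protein sequence.

    k=3 is an illustrative assumption; the abstract does not state a word length.
    """
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def repetition_fraction(proteins, k=3):
    """Fraction of word occurrences that belong to words seen more than once.

    This is a hypothetical repetition measure chosen for the sketch,
    not the statistic reported in the paper.
    """
    counts = Counter()
    for protein in proteins:
        counts.update(peptide_words(protein, k))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total
```

Under these assumptions, a proteome would be passed in as a list of amino acid strings, and a higher returned value would correspond to more repetitive peptide word usage. Comparing free-living and pathogen genomes as in the abstract would additionally require controlling for genome size and G+C content, which this sketch does not attempt.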