Data Mining for Expressivity of Recombinant Protein Expression

Authors: Atsushi Isoai, Satoshi Kira, Masayuki Yamamura
Year of publication: 2006
Subject:
Source: Transactions of the Japanese Society for Artificial Intelligence. 21:9-19
ISSN: 1346-8030, 1346-0714
Description: We analyzed the expressivity of recombinant proteins using data mining methods. Recombinant protein expression is a key step towards elucidating the functions of genes discovered through genome sequencing projects. We studied the production efficiency of recombinant proteins in the fission yeast Schizosaccharomyces pombe (S. pombe) by mining expression results. We gathered 57 proteins whose approximate expression levels in this host were known. Correlation analysis, principal component analysis, and decision tree analysis were applied to these expression data. Analysis of codon usage and amino acid composition showed that amino acid composition affected the expression level of a recombinant protein more strongly than codon usage did. Furthermore, analysis of amino acid composition showed that protein solubility and the metabolic cost of amino acids correlated with protein expressivity. Codon usage has often attracted interest in the field of recombinant expression. However, our analysis found only a weak correlation between codon features and expressivity. These results indicated that ready-made indices of codon bias were not relevant for modeling the expressivity of recombinant proteins. Our data-driven approach is an easy and powerful way to improve recombinant protein expression, and it deserves attention given the huge amount of expression data accumulating in the post-genome era.
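Illustrative note: the kind of correlation analysis mentioned in the description can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' actual procedure. The function names, the choice of lysine as the example feature, and the toy sequences and expression scores are all hypothetical placeholders; none of them come from the paper's 57-protein data set.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature carries no correlation signal
    return cov / sqrt(vx * vy)


# Hypothetical toy data: (sequence, rough expression score). Made up for
# illustration only; these are NOT measurements from the paper.
toy_proteins = [("MKKLLA", 3), ("MAAAGG", 1), ("MKKKKL", 4)]

# Correlate one composition feature (lysine fraction, an arbitrary choice)
# with the expression score across the toy proteins.
lysine_fraction = [amino_acid_composition(s)["K"] for s, _ in toy_proteins]
expression = [level for _, level in toy_proteins]
r_lysine = pearson(lysine_fraction, expression)
print(f"correlation of lysine fraction with expression: {r_lysine:.3f}")
```

Repeating the last step for each of the 20 residues, or for codon-usage features computed from the coding sequence, would give one correlation per feature, which is the sort of comparison the description summarizes.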
Database: OpenAIRE