Construction of non-symmetric substitution matrices derived from proteomes with biased amino acid distributions

Authors: Sylvaine Roy, Eric Maréchal, Olivier Bastien
Contributors: Laboratoire de physiologie cellulaire végétale (LPCV), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Joseph Fourier - Grenoble 1 (UJF)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Recherche Agronomique (INRA), Université Joseph Fourier - Grenoble 1 (UJF)-Institut National de la Recherche Agronomique (INRA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche Interdisciplinaire de Grenoble (IRIG), Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)
Language: English
Publication year: 2005
Subject:
Source: Comptes Rendus Biologies
Comptes Rendus Biologies, Elsevier, 2005, 328, pp.445-453. ⟨10.1016/j.crvi.2005.02.002⟩
DOI: 10.1016/j.crvi.2005.02.002
Description: Automatic comparison of compositionally biased genomes, such as that of the malaria causative agent Plasmodium falciparum (82% adenosine + thymidine), with genomes of average composition is currently limited. Indeed, popular tools such as BLAST require that amino acid distributions be similar in aligned sequences. However, the P. falciparum genome is so biased that six amino acids account for more than 50% of the protein composition. One reason for the failure of comparison methods lies in the compositional difference between the query and subject proteomes, which is not taken into account in amino acid substitution matrices. This paper introduces a method to derive substitution matrices, in particular BLOSUM 62, within the framework of information theory. It allows the construction of non-symmetric matrices that take into account the non-symmetric amino acid distributions. The dirAtPf family of matrices, which allows the comparison of P. falciparum and A. thaliana, is given as an example. The paper further provides an analysis of the resulting matrices within the framework of information theory, supporting the discrimination advantage they bring. To cite this article: O. Bastien et al., C. R. Biologies 328 (2005).
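Illustration: the abstract does not give the derivation itself, so the following is only a minimal sketch of the standard log-odds construction that substitution matrices such as BLOSUM are based on, generalized so that the query and subject backgrounds may differ. The function name `log_odds_matrix`, the three-letter toy alphabet, and the toy pair frequencies are hypothetical and are not taken from the paper; the actual dirAtPf matrices may be built differently. With a joint pair distribution q(i, j), a query background p(i), and a subject background p'(j), the score is s(i, j) = log2(q(i, j) / (p(i) p'(j))), which is non-symmetric whenever q is.

```python
import math


def log_odds_matrix(joint):
    """Return (scores, relative_entropy), both in bits.

    joint[i][j] is the (unnormalized) frequency of query residue i aligned
    with subject residue j. All entries are assumed to be strictly positive.
    """
    n_rows, n_cols = len(joint), len(joint[0])
    total = sum(sum(row) for row in joint)
    q = [[value / total for value in row] for row in joint]

    # Background distributions are the marginals of the joint distribution:
    # rows give the query proteome, columns give the subject proteome.
    p_query = [sum(q[i]) for i in range(n_rows)]
    p_subject = [sum(q[i][j] for i in range(n_rows)) for j in range(n_cols)]

    # Log-odds score: observed pair frequency versus independent pairing.
    scores = [
        [math.log2(q[i][j] / (p_query[i] * p_subject[j])) for j in range(n_cols)]
        for i in range(n_rows)
    ]

    # Relative entropy (mutual information) of the matrix: the average
    # score per aligned pair under the joint distribution.
    relative_entropy = sum(
        q[i][j] * scores[i][j] for i in range(n_rows) for j in range(n_cols)
    )
    return scores, relative_entropy


# Hypothetical toy data: a 3-letter alphabet with a non-symmetric joint
# distribution, standing in for two proteomes of different composition.
toy_joint = [
    [0.30, 0.05, 0.05],
    [0.02, 0.20, 0.03],
    [0.03, 0.02, 0.30],
]
toy_scores, toy_entropy = log_odds_matrix(toy_joint)

# A symmetric joint distribution, for comparison, yields a symmetric matrix.
sym_scores, sym_entropy = log_odds_matrix([[0.4, 0.1], [0.1, 0.4]])
```

In this sketch `toy_scores[0][1]` and `toy_scores[1][0]` differ because the two marginal distributions differ, whereas `sym_scores` is symmetric. The relative entropy is the information-theoretic quantity commonly used to compare the discriminating power of such matrices.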
Database: OpenAIRE