A simple derivation of the distribution of pairwise local protein sequence alignment scores
Author: | Bastien, Olivier |
---|---|
Contributors: | Laboratoire de physiologie cellulaire végétale (LPCV), Université Joseph Fourier - Grenoble 1 (UJF)-Institut National de la Recherche Agronomique (INRA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche Interdisciplinaire de Grenoble (IRIG), Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), ANR-06-MDCA-0014,PlasmoExplore,Fouille des données génomiques de Plasmodium falciparum pour prédire la fonction des gènes orphelins et identifier de nouvelles cibles thérapeutiques(2006), Université Joseph Fourier - Grenoble 1 (UJF)-Institut National de la Recherche Agronomique (INRA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Centre National de la Recherche Scientifique (CNRS), ANR PlasmoExplore,ANR PlasmoExplore, Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Joseph Fourier - Grenoble 1 (UJF)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Recherche Agronomique (INRA) |
Language: | English |
Publication year: | 2008 |
Subject: |
alignement de séquences; 0303 health sciences; sequence comparison; Karlin-Altschul theorem; [SDV]Life Sciences [q-bio]; lcsh:Evolution; bioinformatics; SEQUENCE PROTEIQUE; Computer Science Applications; reliability theory; comparaison de séquences; 03 medical and health sciences; protein sequence; 0302 clinical medicine; sequence alignment; lcsh:QH359-425; Genetics; [SDV.BBM]Life Sciences [q-bio]/Biochemistry, Molecular Biology; conservation function; bioinformatique; Rapid Communication; 030217 neurology & neurosurgery; Ecology, Evolution, Behavior and Systematics; 030304 developmental biology |
Source: | Evolutionary Bioinformatics, Vol 4, pp. 41-45 (2008), Libertas Academica (New Zealand) |
ISSN: | 1176-9343 |
Description: | Confidence in pairwise alignments of biological sequences, obtained by methods such as BLAST or Smith-Waterman, is critical for automatic analyses of genomic data. In the asymptotic limit of long sequences, the Karlin-Altschul model computes a P-value assuming that the number of high-scoring matching regions above a threshold is Poisson distributed. Using a simple approach combined with recent results in reliability theory, we demonstrate here that the Karlin-Altschul model can be derived with no reference to extreme value theory. Sequences were considered as systems whose components are amino acids and that have a high redundancy of information, reflected in their alignment scores. The evolution of the information shared between aligned components determined the Shared Amount of Information (SA.I.) between sequences, i.e. the score. The parameters of the Gumbel distribution of aligned-sequence scores thereby receive a theoretical rationale: the first is the hazard rate of the distribution of scores between residues, and the second is the probability that two aligned residues do not lose bits of information (i.e. conserve an initial pairing score) when a mutation occurs. |
Database: | OpenAIRE |
External link: |
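
For readers unfamiliar with the P-value that the description refers to, the following is a minimal sketch of the standard Karlin-Altschul formula, in which the number of segments scoring at least S is Poisson distributed with mean E = K·m·n·exp(−λS), so that P = 1 − exp(−E). The function name, the sequence lengths and the numerical values of λ and K below are illustrative assumptions, not values taken from the paper.

```python
import math

def karlin_altschul_pvalue(score, m, n, lam, k):
    """Probability of at least one local alignment scoring >= score.

    m, n : lengths of the two sequences
    lam, k : Karlin-Altschul parameters (depend on the scoring system)
    """
    # Poisson mean: expected number of segments scoring at least `score`.
    expected = k * m * n * math.exp(-lam * score)
    # P(at least one such segment) = 1 - exp(-E).
    # expm1 keeps precision when E is very small.
    return -math.expm1(-expected)

# Illustrative (assumed) parameter values, chosen only for demonstration.
p = karlin_altschul_pvalue(score=60, m=300, n=300, lam=0.318, k=0.134)
print(f"P-value: {p:.3e}")
```

When E is small, P is approximately equal to E, which is why the E-value and the P-value are often used interchangeably for highly significant alignments.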