An Approximate de Bruijn Graph Approach to Multiple Local Alignment and Motif Discovery in Protein Sequences

Authors: Mehmet Dalkilic, Haixu Tang, Sun Kim, Rupali P Patwardhan
Publication year: 2006
Subject:
Source: Data Mining and Bioinformatics ISBN: 9783540689706
VDMB
DOI: 10.1007/11960669_14
Description: Motif discovery is an important problem in protein sequence analysis. Computationally, it can be viewed as an application of the more general multiple local alignment problem, which often becomes prohibitively expensive in computing time when many sequences are aligned. We introduce a new algorithm for multiple local alignment of protein sequences, based on the de Bruijn graph approach first proposed by Zhang and Waterman for aligning DNA sequences. We generalize their approach to protein sequences by building an approximate de Bruijn graph that allows similar but not identical amino acids to be glued together. We implemented this algorithm and tested it on motif discovery in 100 sets of protein sequences. The results show that our method achieves results comparable to those of other popular motif discovery programs, while offering an advantage in speed.
Database: OpenAIRE
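
Note: the description above mentions gluing similar but not identical amino acids in an approximate de Bruijn graph. The abstract does not say how similarity is defined, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical reduced alphabet (`CLASSES`) in which residues of the same class are treated as interchangeable; the function names and the grouping are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical similarity classes (an assumption, not taken from the paper):
# residues in the same class are treated as interchangeable when gluing.
CLASSES = ["AVLIMC", "FWYH", "ST", "NQ", "KR", "DE", "G", "P"]
RESIDUE_TO_CLASS = {aa: str(i) for i, group in enumerate(CLASSES) for aa in group}


def reduce_kmer(kmer):
    """Map a k-mer to its class string; unknown residues become 'X'."""
    return "".join(RESIDUE_TO_CLASS.get(aa, "X") for aa in kmer)


def build_approx_graph(sequences, k):
    """Build a de Bruijn-style graph whose nodes are class-reduced k-mers.

    Returns (nodes, edges):
      nodes: reduced k-mer -> list of (sequence index, offset) occurrences
      edges: (reduced k-mer, next reduced k-mer) -> number of times observed
    """
    nodes = defaultdict(list)
    edges = defaultdict(int)
    for s, seq in enumerate(sequences):
        prev = None
        for i in range(len(seq) - k + 1):
            node = reduce_kmer(seq[i:i + k])
            nodes[node].append((s, i))  # similar k-mers land on the same node
            if prev is not None:
                edges[(prev, node)] += 1
            prev = node
    return nodes, edges


def shared_nodes(nodes, min_sequences):
    """Keep nodes that occur in at least `min_sequences` distinct sequences."""
    return {
        node: occ
        for node, occ in nodes.items()
        if len({s for s, _ in occ}) >= min_sequences
    }


if __name__ == "__main__":
    seqs = ["MKVLDE", "AKILEDG"]
    nodes, edges = build_approx_graph(seqs, k=3)
    for node, occ in shared_nodes(nodes, min_sequences=2).items():
        print(node, occ)
```

In this sketch, the k-mers `MKV` and `AKI` differ in two residues but fall on the same node because M/A and V/I share a class, which is the sense in which similar k-mers are glued. Paths of heavily shared nodes would then be candidate locally aligned regions; scoring and path extraction are left out.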