Simple and Efficient Pattern Matching Algorithms for Biological Sequences
Authors: | Montassir Hadi, Mahmoud Naghibzadeh, Peyman Neamatollahi |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: |
0301 basic medicine;
Speedup; General Computer Science; Computer science; Bioinformatics; string matching; 0206 medical engineering; DNA sequence; Biological database; 02 engineering and technology; String searching algorithm; 03 medical and health sciences; chemistry.chemical_compound; General Materials Science; Pattern matching; Biological data; General Engineering; exact algorithm; 030104 developmental biology; Exact algorithm; chemistry; lcsh:Electrical engineering. Electronics. Nuclear engineering; Algorithm; frequent pattern; lcsh:TK1-9971; 020602 bioinformatics; DNA; Word (computer architecture) |
Source: | IEEE Access, Vol 8, Pp 23838-23846 (2020) |
ISSN: | 2169-3536 |
Description: | The rapid growth of biological data motivates faster solutions in many domains of computational bioinformatics. Pattern matching is a common operation at several stages of computational pipelines; for example, it lets users find the locations of particular DNA subsequences in a database or in a DNA sequence. Furthermore, some patterns in these expanding biological databases are updated over time, so high-speed pattern matching algorithms are needed to keep searches fast. This paper introduces three pattern matching algorithms formulated specifically to speed up searches on large DNA sequences. The proposed algorithms improve performance by processing words rather than single characters, as previous works do, and by searching the sequence for the least frequent word of the pattern. The experimental results show that the proposed algorithms have a lower time cost than the other algorithms evaluated. |
Database: | OpenAIRE |
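
The abstract names two ideas: comparing whole words instead of single characters, and searching for the least frequent word of the pattern. The Python sketch below illustrates only that general idea and is not any of the paper's three algorithms. The function name `find_pattern`, the parameter `word_len`, and the choice to estimate word frequencies by counting every word in the text are all assumptions made for this illustration.

```python
from collections import Counter


def find_pattern(text: str, pattern: str, word_len: int = 4) -> list[int]:
    """Return every start index of `pattern` in `text` (exact matching).

    Illustrative sketch only: `word_len` and the frequency estimate
    are assumptions, not details taken from the paper.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    w = min(word_len, m)

    # Count each length-w word of the text (assumed preprocessing step).
    freq = Counter(text[i:i + w] for i in range(n - w + 1))

    # Pick the offset in the pattern whose word is rarest in the text.
    offset = min(range(m - w + 1), key=lambda j: freq[pattern[j:j + w]])
    rare = pattern[offset:offset + w]

    hits = []
    # Start at `offset` so the implied pattern start is never negative.
    pos = text.find(rare, offset)
    while pos != -1:
        start = pos - offset
        # Verify the whole pattern only where the rare word occurs.
        if text[start:start + m] == pattern:
            hits.append(start)
        pos = text.find(rare, pos + 1)
    return hits


if __name__ == "__main__":
    print(find_pattern("CCGATTACAGGATTACA", "GATTACA"))
```

Anchoring on a rare word keeps the number of full verifications small, because the full comparison runs only at positions where that word appears in the sequence.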