High-Performance Hybrid Computing for Bioinformatic Analysis of Protein Superfamilies

Authors: Maksim V. Shegay, Vytas K. Švedas, Dmitry A. Suplatov, Nina N. Popova, Yana A. Sharapova, Vladimir V. Voevodin, Kateryna Fesko
Publication year: 2019
Subject:
Source: Communications in Computer and Information Science ISBN: 9783030365912
RuSCDays
Description: Constructing a multiple alignment of proteins that implement different functions within the common structural fold of a superfamily is valuable in bioinformatics but challenging. The process can be seen as a pipeline of independent sequential steps of equivalent computational complexity, each performed by a different set of algorithms. In this work, the overall productivity of the corresponding Mustguseal protocol was significantly improved by selecting an appropriate optimization strategy for each step of the pipeline. This HPC installation was used to collect and superimpose, within 12 h, a representative set of 299,976 sequences and structures of the fold-type I PLP-dependent enzymes, which appears to be the largest alignment of a protein superfamily ever constructed. The use of hybrid acceleration strategies provided routine access to sequence/structure comparison of evolutionarily related proteins at a scale that would previously have been intractable, in order to study the structure-function relationship and solve practically relevant problems, thus promoting the value of bioinformatics and HPC in protein engineering and drug discovery.
Database: OpenAIRE