Graph Theory-Based Sequence Descriptors as Remote Homology Predictors
Author: | Reinaldo Molina-Ruiz, Evys Ancede-Gallardo, Gustavo A. de la Riva, Agostinho Antunes, Deborah Galpert, Guillermin Agüero-Chapin, Gisselle Pérez-Machado |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
Bioinformatics; Computer science; Topological indices; Big data; lcsh:QR1-502; Sequence Homology; Review; Alignment-free; Machine learning; Biochemistry; lcsh:Microbiology; Homology (biology); 03 medical and health sciences; 0302 clinical medicine; Sequence Analysis, Protein; Computer Graphics; Amino Acid Sequence; Molecular Biology; 030304 developmental biology; Graphical user interface; 0303 health sciences; QSAR; Computational Biology; Graph theory; Weighting; 030220 oncology & carcinogenesis; Graph (abstract data type); Artificial intelligence |
Source: | Biomolecules, Vol 10, Iss 1, p 26 (2019) |
ISSN: | 2218-273X |
Description: | Indexed in: Scopus. Alignment-free (AF) methodologies have grown in popularity over the last decades as alternatives to alignment-based (AB) algorithms for comparative sequence analysis. They have been especially useful for detecting remote homologs within the twilight zone of highly diverse gene/protein families and superfamilies. The most popular alignment-free methodologies, as well as their applications to classification problems, have been described in previous reviews. A new set of graph theory-derived sequence/structural descriptors has been gaining relevance in the detection of remote homology, yet these descriptors have been omitted as AF predictors when the topic is addressed. Here, we first go over the most popular AF approaches used for detecting homology signals within the twilight zone and then present the state-of-the-art tools that encode graph theory-derived sequence/structure descriptors, along with their success in identifying remote homologs. We also highlight the trend of integrating AF features/measures with AB ones, either within the same prediction model or by combining the predictions of different algorithms through voting/weighting strategies, to improve the detection of remote signals. Lastly, we briefly discuss the efforts made to scale up AB and AF features/measures for the comparison of multiple genomes and proteomes. Alongside the experience gained in remote homology detection with both the most popular AF tools and less well-known ones, we report our own experience with the graphical–numerical methodologies MARCH-INSIDE, TI2BioP, and ProtDCal. We also present a new Python-based tool (SeqDivA) with a friendly graphical user interface (GUI) for delimiting the twilight zone using several similarity criteria. https://www.mdpi.com/2218-273X/10/1/26 |
Database: | OpenAIRE |
External link: |