An alignment-free method for detection of missing regions for phylogenetic analysis

Authors: Rubyeat Islam, Atif Rahman
Language: English
Year of publication: 2024
Subject:
Source: Heliyon, Vol 10, Iss 11, Pp e32227- (2024)
Document type: article
ISSN: 2405-8440
DOI: 10.1016/j.heliyon.2024.e32227
Description: Phylogenetic tree estimation using conventional approaches usually requires pairwise or multiple sequence alignment. However, sequence alignment faces scalability and accuracy problems for long sequences such as whole genomes, at low sequence identity, and in the presence of genomic rearrangements. To address these issues, alignment-free approaches have been proposed. While these methods have demonstrated promising results, many of them produce errors when regions are missing from the sequences of one or more species, a situation that alignment-based methods detect trivially. Here, we present an alignment-free method for detecting missing regions in the sequences of species whose phylogeny is to be estimated. It is based on k-mer counts and can be used to filter out k-mers belonging to regions in one species that are missing in one or more of the other species. We perform experiments with real and simulated datasets containing missing regions and find that the method can successfully detect a large fraction of such k-mers and can lead to improvements in the estimated phylogenies. Our method can be used in k-mer based alignment-free phylogeny estimation methods to filter out k-mers corresponding to missing regions.
Database: Directory of Open Access Journals
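
The description says the method counts k-mers and filters out those that belong to regions missing in other species, but it does not give the detection criterion. The sketch below is therefore only a naive illustration of the general idea, not the authors' algorithm: it assumes a k-mer is "missing" whenever it is absent from at least one species, and the function names `count_kmers` and `filter_unshared_kmers` are hypothetical.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def filter_unshared_kmers(sequences, k):
    """Keep, for each species, only the k-mers present in every species.

    Assumption: plain presence/absence stands in for the paper's
    count-based test, which the description does not specify.
    """
    counts = {name: count_kmers(seq, k) for name, seq in sequences.items()}
    # k-mers observed at least once in all species
    shared = set.intersection(*(set(c) for c in counts.values()))
    return {
        name: Counter({kmer: n for kmer, n in c.items() if kmer in shared})
        for name, c in counts.items()
    }


# Toy input: species "b" lacks the trailing region found in species "a".
sequences = {"a": "ACGTACGTTTGCA", "b": "ACGTACGT"}
filtered = filter_unshared_kmers(sequences, k=4)
```

A strict intersection like this also discards k-mers that differ only through point mutations, so a practical filter would need a criterion that separates genuinely missing regions from ordinary sequence divergence.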