An alignment- and reference-free strategy using k-mer present pattern for population genomic analyses
Author: | Guohui Shi, Yi Dai, Da Zhou, Mengmeng Chen, Jiaqi Zhang, Yilong Bi, Shuai Liu, Qi Wu |
---|---|
Language: | English |
Year of publication: | 2024 |
Subject: | |
Source: | Mycology, Pp 1-15 (2024) |
Document type: | article |
ISSN: | 2150-1203, 2150-1211 |
DOI: | 10.1080/21501203.2024.2358868 |
Description: | Pangenomes are replacing single reference genomes to capture all variants within a species or clade, but their analysis relies predominantly on graph-based methods that require multiple high-quality genomes and computationally intensive multiple-genome alignments. K-mer decomposition is an alternative to graph-based pangenomes. However, how to use k-mers directly for population genetic analyses remains unknown. Here, we developed a novel strategy that uses variation in k-mer counts in the genome for population analyses. To test the effectiveness of this method, we compared it directly with the SNP-based method in analyses of the population structure and genetic diversity of 267 Saccharomyces cerevisiae strains, using two simulated datasets and a real sequence dataset. The population structure identified with k-mers recapitulates that obtained using SNPs, indicating the effectiveness of the k-mer-based approach, and the higher genetic diversity observed in the real dataset suggests that k-mers capture more genetic variants. Based on k-mer frequency, we found not only SNPs but also some insertions/deletions and horizontal gene transfer (HGT) fragments related to the adaptive evolution of S. cerevisiae. Our study creates a framework for the alignment- and reference-free (ARF) method in population genetic analyses, an approach whose benefits should be more pronounced in species with no complete genome or in highly diverged species. |
Database: | Directory of Open Access Journals |
External link: |
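
The abstract's core idea, comparing genomes through the k-mers they contain rather than through alignment to a reference, can be illustrated with a toy sketch. This is not the authors' pipeline: the function names (`canonical_kmers`, `presence_matrix`, `jaccard_distance`), the use of canonical k-mers, the choice of Jaccard distance, and the three short sequences are all assumptions made for illustration only.

```python
def canonical_kmers(seq, k):
    """Return the set of canonical k-mers in seq.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so both strands map to the same key.
    (Hypothetical helper, not taken from the paper.)
    """
    comp = str.maketrans("ACGT", "TGCA")
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
            rc = kmer.translate(comp)[::-1]
            kmers.add(min(kmer, rc))
    return kmers


def presence_matrix(samples, k):
    """Build a presence/absence (1/0) row per sample over the union of k-mers."""
    sets = {name: canonical_kmers(seq, k) for name, seq in samples.items()}
    all_kmers = sorted(set().union(*sets.values()))
    matrix = {name: [int(km in s) for km in all_kmers] for name, s in sets.items()}
    return all_kmers, matrix


def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 rows: 1 - |shared| / |present in either|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0


# Toy input: made-up sequences, not real S. cerevisiae data.
toy_samples = {
    "strainA": "ACGTACGTGA",
    "strainB": "ACGTACGTGA",
    "strainC": "ACGTTCGTGA",  # one substitution relative to strainA
}
all_kmers, matrix = presence_matrix(toy_samples, k=4)
dist_ab = jaccard_distance(matrix["strainA"], matrix["strainB"])
dist_ac = jaccard_distance(matrix["strainA"], matrix["strainC"])
```

A pairwise distance table built this way could be passed to standard clustering or ordination tools to inspect population structure. On real genomes a dedicated k-mer counter and a larger k would be needed, since pure-Python sets do not scale to whole-genome data.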