ToRQuEMaDA: tool for retrieving queried Eubacteria, metadata and dereplicating assemblies
Author: | Marie Leleu, Denis Baurain, Frédéric Kerff, Mick Van Vlierberghe, Raphaël R. Léonard |
---|---|
Language: | English |
Publication year: | 2021 |
Subject: |
Alignment-free methods; Information retrieval; Singularity; Bioinformatics; Phylum; Computer science; Strain (biology); Phylogenomics; Genomics; NCBI RefSeq; Dereplication; Microbiology; Metadata; Genome selection; Scalability; Medicine; Metagenomics; Taxonomic rank; Prokaryotes; Heuristics; Cluster analysis; Word (computer architecture); Taxonomy; Genome quality |
Source: | PeerJ, Vol 9, p e11348 (2021) |
ISSN: | 2167-8359 |
Description: | TQMD is a tool that downloads, stores and produces lists of dereplicated prokaryotic genomes. It has been developed to cope with the ever-growing number of prokaryotic genomes and their uneven taxonomic distribution. It is based on word-based alignment-free methods (k-mers), an iterative single-linkage approach and a divide-and-conquer strategy to remain both efficient and scalable. We studied the performance of TQMD by verifying the influence of its parameters and heuristics on the clustering outcome. We further compared TQMD to two other dereplication tools (dRep and Assembly-Dereplicator). Our results showed that TQMD is optimized to dereplicate at high taxonomic levels (phylum/class), whereas the other dereplication tools are optimized for lower taxonomic levels (species/strain), making TQMD complementary to the existing dereplicating tools. TQMD is available at |
Database: | OpenAIRE |
External link: |
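
Illustrative note: the description mentions k-mer based, alignment-free comparison combined with single-linkage clustering. The Python sketch below shows that general idea on toy sequences. It is not TQMD's implementation: the function names, the Jaccard distance, the default values of `k` and `threshold`, and the rule of keeping the alphabetically first genome per cluster are all assumptions made for illustration, and the iterative and divide-and-conquer parts of the approach are not shown.

```python
from itertools import combinations


def kmer_set(sequence, k):
    """Return the set of all overlapping k-mers (words of length k) in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def single_linkage_clusters(genomes, k, threshold):
    """Cluster genomes with single linkage.

    Two genomes end up in the same cluster if they are connected by a chain
    of pairwise distances that are each <= threshold (assumed criterion).
    """
    names = sorted(genomes)
    kmers = {name: kmer_set(genomes[name], k) for name in names}
    parent = {name: name for name in names}  # union-find forest

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard_distance(kmers[a], kmers[b]) <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[max(root_a, root_b)] = min(root_a, root_b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


def dereplicate(genomes, k=4, threshold=0.5):
    """Keep one representative per cluster (hypothetical rule: first name alphabetically)."""
    return [cluster[0] for cluster in single_linkage_clusters(genomes, k, threshold)]


# Toy input: g1 and g2 are near-identical, g3 shares no 4-mers with them.
toy_genomes = {
    "g1": "ACGTACGTAC",
    "g2": "ACGTACGTAA",
    "g3": "TTTTGGGGCC",
}
representatives = dereplicate(toy_genomes, k=4, threshold=0.5)
```

With this toy input, `g1` and `g2` fall into one cluster and `g3` into another, so one genome from each cluster is kept. A real tool would select representatives using genome quality and taxonomy metadata rather than name order.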