NGS-μsat: bioinformatics framework supporting high throughput microsatellite genotyping from next generation sequencing platforms

Authors: Daniel D. Heath, Clare J. Venney, Ryan P. Walter, Denis Roy, Sarah J. Lehnert
Year of publication: 2021
Subject:
Source: Conservation Genetics Resources. 13:161-173
ISSN: 1877-7260, 1877-7252
DOI: 10.1007/s12686-020-01186-0
Description: Although genetic techniques are moving toward collecting massive amounts of genome-wide data through genome scans, microsatellite markers (µsats) still provide a simple and cost-effective method for key applications such as parentage analyses, pedigree tracking, assessing likelihoods of disease conditions and DNA fingerprinting. Newer laboratory protocols using high-throughput sequencing platforms can now generate µsat data more efficiently than ever before. Yet there is a dearth of easy-to-use, interactive software that reliably converts raw sequencing data into individual-based multi-locus µsat genotypes suitable for typical downstream analyses. We describe the development and application of NGS-µsat, an R-based software workflow capable of converting raw µsat sequence data produced using next-generation sequencing platforms into multi-locus genotypes. Because the algorithm identifies repeat motifs, it does not rely on identifying and removing extraneous sequence fragments from sequenced reads to score loci. Accordingly, the software scores 'true' µsat repeats and provides an accurate and clean picture of locus information without the assessment ambiguity typical of scoring based on fragment lengths. In comparative analyses, NGS-µsat produced cleaner, more reliable and more repeatable genotypes than those obtained by scoring the same data with other software based on fragment lengths. This increased reliability and reproducibility of generated data may expand the use of high-throughput sequencing-based techniques to routine DNA profiling, DNA fingerprinting and parentage/pedigree analyses, and revitalise the application of µsats more broadly.
Database: OpenAIRE
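
The description contrasts scoring alleles by repeat motif with scoring by fragment length. The sketch below illustrates that general idea only. It is not the NGS-µsat algorithm, which is an R-based workflow whose internals are not given in this record. Python is used purely for illustration, and the function name `count_repeat_units`, the motif "CA" and the example reads are all hypothetical.

```python
import re


def count_repeat_units(read: str, motif: str) -> int:
    """Return the number of motif copies in the longest uninterrupted tandem run.

    Hypothetical helper for illustration; not part of NGS-µsat.
    """
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    # Length (in bases) of the longest run of back-to-back motif copies.
    longest = max((m.end() - m.start() for m in pattern.finditer(read)), default=0)
    return longest // len(motif)


# Made-up reads from one locus: read_a and read_b carry the same allele
# (6 x "CA") but differ in how much flanking sequence they include.
read_a = "GGTTAC" + "CA" * 6 + "TTGACG"
read_b = "TTAC" + "CA" * 6 + "TTG"
# read_c carries a different allele (7 x "CA") with the same flanks as read_a.
read_c = "GGTTAC" + "CA" * 7 + "TTGACG"

for name, read in [("read_a", read_a), ("read_b", read_b), ("read_c", read_c)]:
    print(name, "length:", len(read), "repeat units:", count_repeat_units(read, "CA"))
```

In this toy case read_a and read_b differ in length but yield the same repeat count, so a motif-based score does not depend on how much flanking sequence is present, whereas a length-based score would need the flanks trimmed consistently first.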