VCF2CAPS–A high-throughput CAPS marker design from VCF files and its test-use on a genotyping-by-sequencing (GBS) dataset

Authors: Joanna Augustynowicz, Marek Szklarczyk, Wojciech Wesołowski, Beata Domnicz
Year of publication: 2021
Subjects:
0106 biological sciences
0301 basic medicine
Heredity
Single Nucleotide Polymorphisms
Datasets as Topic
Artificial Gene Amplification and Extension
Polymerase Chain Reaction
01 natural sciences
Homozygosity
Cleaved amplified polymorphic sequence
Biology (General)
education.field_of_study
Heterozygosity
Ecology
Software Engineering
High-Throughput Nucleotide Sequencing
Genomics
Genetic Mapping
Restriction site
Molecular Diagnostic Techniques
Computational Theory and Mathematics
Modeling and Simulation
Engineering and Technology
Research Article
Genetic Markers
Computer and Information Sciences
Genotype
QH301-705.5
Population
Variant Genotypes
Computational biology
Biology
Research and Analysis Methods
DNA sequencing
Computer Software
03 medical and health sciences
Cellular and Molecular Neuroscience
Genetics
Molecular Biology Techniques
education
Indel
Molecular Biology
Alleles
Ecology, Evolution, Behavior and Systematics
Biology and Life Sciences
Molecular diagnostics
Restriction enzyme
030104 developmental biology
Genetic Loci
Software
010606 plant biology & botany
Source: PLoS Computational Biology, Vol 17, Iss 5, p e1008980 (2021)
PLoS Computational Biology
ISSN: 1553-7358
Description: Next-generation sequencing (NGS) is a powerful tool for large-scale detection of DNA sequence variants such as single nucleotide polymorphisms (SNPs), multi-nucleotide polymorphisms (MNPs) and insertions/deletions (indels). For routine screening of numerous samples, these variants are often converted into cleaved amplified polymorphic sequence (CAPS) markers, which are based on the presence versus absence of restriction sites within PCR products. Current computational tools for SNP-to-CAPS conversion are limited and usually impractical for large datasets such as those generated with NGS. Moreover, no tool is available for large-scale conversion of MNPs and indels into CAPS markers. Here, we present VCF2CAPS, new software for identifying restriction endonucleases that recognize SNP/MNP/indel-containing sequences from NGS experiments. The program also provides two filtering utilities not available in other SNP-to-CAPS converters: selection of markers with a single polymorphic cut site within a user-specified sequence length, and selection of markers that differentiate up to three user-defined groups of individuals from the analyzed population. The performance of VCF2CAPS was tested on a thoroughly analyzed dataset from a genotyping-by-sequencing (GBS) experiment, and a selection of CAPS markers picked by the program was verified experimentally. CAPS markers, also referred to as PCR-RFLPs, are among the basic tools used in plant, animal and human genetics. VCF2CAPS fills a gap in the current inventory of genetic software by enabling high-throughput CAPS marker design from NGS data. The program should be of interest to geneticists involved in molecular diagnostics. In this paper we show a successful example application of VCF2CAPS, and we believe that the growing availability of NGS services will make it increasingly useful.
Database: OpenAIRE
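
Note: the principle behind a CAPS marker described in the abstract (a restriction site present in one allele and absent in the other) can be illustrated with a minimal Python sketch. This is not the VCF2CAPS implementation. The function names, the tiny enzyme table and the example sequences are assumptions made up for illustration. The three recognition sequences are the standard palindromic sites of EcoRI, HindIII and BamHI, so only one strand is scanned; non-palindromic or degenerate (IUPAC) sites would need extra handling.

```python
# Illustrative enzyme table (assumption): name -> recognition sequence.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def count_sites(seq, site):
    """Count occurrences of a recognition site in seq, overlaps included."""
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def caps_candidates(left_flank, ref, alt, right_flank, enzymes=ENZYMES):
    """Return (enzyme, sites_in_ref, sites_in_alt) for every enzyme whose
    number of recognition sites differs between the two alleles."""
    ref_seq = (left_flank + ref + right_flank).upper()
    alt_seq = (left_flank + alt + right_flank).upper()
    hits = []
    for name, site in enzymes.items():
        n_ref = count_sites(ref_seq, site)
        n_alt = count_sites(alt_seq, site)
        if n_ref != n_alt:
            hits.append((name, n_ref, n_alt))
    return hits


def single_polymorphic_site(hits):
    """Keep enzymes that cut one allele exactly once and the other not at
    all, mimicking the 'single polymorphic cut site' filter in spirit."""
    return [h for h in hits if {h[1], h[2]} == {0, 1}]


if __name__ == "__main__":
    # Hypothetical SNP (A -> G) with made-up flanking sequence.
    hits = caps_candidates("ACGTGA", "A", "G", "TTCGGTA")
    print(single_polymorphic_site(hits))
```

In this made-up example the reference allele carries an EcoRI site that the A-to-G substitution destroys, so digestion of the PCR product would distinguish the two alleles. The same comparison works for MNPs and indels, because the alleles are simply substituted between the flanks before scanning.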