Dealing with paralogy in RADseq data: in silico detection and single nucleotide polymorphism validation in Robinia pseudoacacia L.
Authors: | Olivier De Thier, Ludivine Lassois, Frédéric Gévaudant, Stéphanie Mariette, Cindy Verdu, Yec’han Laizet, Adline Delcamp, Annabel J. Porté, Erwan Guichoux, Samuel Quevauvillers, Philippe Lejeune, Arnaud Monty |
---|---|
Contributors: | Université de Liège, Biodiversité, Gènes & Communautés (BioGeCo), Institut National de la Recherche Agronomique (INRA)-Université de Bordeaux (UB), Biologie du fruit et pathologie (BFP), Université Bordeaux Segalen - Bordeaux 2-Institut National de la Recherche Agronomique (INRA)-Université Sciences et Technologies - Bordeaux 1, Biodiversity and Landscape Unit, Gembloux Agro-Bio Tech |
Language: | English |
Year of publication: | 2016 |
Subject: |
single nucleotide polymorphism (SNP); 0301 basic medicine; [SDV] Life Sciences [q-bio]; Population genetics; Computational biology; Biology; Robinia pseudoacacia; depth of coverage; Genome; 03 medical and health sciences; Genotype; [SDV.BV] Life Sciences [q-bio]/Vegetal Biology; SNP; Allele; Genotyping; Ecology, Evolution, Behavior and Systematics; Nature and Landscape Conservation; Genetics; Genetic diversity; black locust; putative paralogy; filtering; restriction site-associated DNA sequencing; Ecology; polymorphic gene; 15. Life on land; 030104 developmental biology; [SDE] Environmental Sciences |
Source: | Ecology and Evolution, Wiley Open Access, 2016, 6 (20), pp. 7323-7333. ⟨10.1002/ece3.2466⟩ |
ISSN: | 2045-7758 |
DOI: | 10.1002/ece3.2466 |
Description: | The RADseq technology allows researchers to efficiently develop thousands of polymorphic loci across multiple individuals with little or no prior information on the genome. However, many questions remain about the biases inherent to this technology. Notably, sequence misalignments arising from paralogy may affect the development of single nucleotide polymorphism (SNP) markers and the estimation of genetic diversity. We evaluated the impact of putative paralogous loci on genetic diversity estimation during the development of SNPs from a RADseq dataset for the nonmodel tree species Robinia pseudoacacia L. We sequenced nine genotypes and analyzed the frequency of putative paralogous RAD loci as a function of both the depth of coverage and the mismatch threshold allowed between loci. The proportion of loci flagged as putatively paralogous varied widely, from 1% to more than 20%, with the depth of coverage having a major influence on the result. Putative paralogy artificially increased the observed degree of polymorphism and the resulting estimates of diversity. The choice of the depth-of-coverage threshold also affected diversity estimation and SNP validation: a low threshold decreased the chances of detecting minor alleles, while a high threshold increased allelic dropout. SNP validation was better for the low threshold (4×) than for the high threshold (18×) we tested. Using the strategy developed here, we were able to validate more than 80% of the SNPs tested by means of individual genotyping, resulting in a readily usable set of 330 SNPs suitable for population genetics applications. |
Database: | OpenAIRE |
External link: |