Genome survey sequencing of Dioscorea zingiberensis

Authors: Zhezhi Wang, Bin Li, Shuchao Feng, Wen Zhou, Yuanchu Liu, Lin Li, Wen Ma
Year of publication: 2018
Subject:
Source: Genome. 61:567-574
ISSN: 1480-3321
0831-2796
DOI: 10.1139/gen-2018-0011
Description: Dioscorea zingiberensis (Dioscoreaceae) is the main plant source of diosgenin (a steroidal sapogenin), the precursor for the production of steroid hormones in the pharmaceutical industry. Despite its large economic value, genomic information for the genus Dioscorea is currently unavailable. Here, we present an initial survey of the D. zingiberensis genome performed by next-generation sequencing together with a genome size estimate obtained by flow cytometry. The whole-genome survey of D. zingiberensis generated 31.48 Gb of sequence data with approximately 78.70× coverage. The estimated genome size is 800 Mb, with a high level of heterozygosity based on k-mer analysis. The reads were assembled into 334 288 contigs with an N50 length of 1079 bp, which were further assembled into 92 163 scaffolds with a total length of 173.46 Mb. A total of 4935 genes, 81 tRNAs, 69 rRNAs, and 661 miRNAs were predicted by the genome analysis, and 263 484 repeated sequences were obtained with 419 372 simple sequence repeats (SSRs). Among these SSRs, the mononucleotide repeat type was the most abundant (54.60% of the total SSRs), followed by the dinucleotide (29.60%), trinucleotide (11.37%), tetranucleotide (3.53%), pentanucleotide (0.65%), and hexanucleotide (0.25%) repeat types. The 1C-value of D. zingiberensis was calibrated against Salvia miltiorrhiza and calculated as 0.87 pg (851 Mb) by flow cytometry, which is very close to the result of the genome survey. This is the first report of genome-wide characterization within this taxon.
Database: OpenAIRE
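
Note on the 1C-value conversion: the description reports 0.87 pg as 851 Mb but does not state the conversion factor used. The minimal sketch below assumes the commonly cited factor of 978 Mb per pg of DNA, which reproduces the reported figure; the function and constant names are illustrative and not taken from the paper.

```python
# Assumption: 1 pg of DNA corresponds to about 978 Mb.
# The abstract does not state which factor the authors applied.
MB_PER_PG = 978.0


def c_value_to_mb(picograms: float) -> float:
    """Convert a 1C-value in picograms to genome size in megabases."""
    return picograms * MB_PER_PG


def relative_difference(a: float, b: float) -> float:
    """Return the difference of a from b as a fraction of b."""
    return (a - b) / b


flow_cytometry_mb = c_value_to_mb(0.87)  # 1C-value reported in the description
survey_estimate_mb = 800.0               # k-mer based estimate from the description

print(round(flow_cytometry_mb))
print(f"{relative_difference(flow_cytometry_mb, survey_estimate_mb):.1%}")
```

Under this assumption the flow cytometry value rounds to 851 Mb, roughly 6% above the 800 Mb survey estimate.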