Transcriptome-scale homoeolog-specific transcript assemblies of bread wheat
Authors: | Stephan Kong, Andreas W. Schreiber, Ute Baumann, Peter Langridge, Matthew J. Hayden, Kerrie Forrest |
---|---|
Language: | English |
Subject: |
lcsh:QH426-470; lcsh:Biotechnology; Sequence assembly; Computational biology; Biology; Genome; DNA sequencing; Contig Mapping; Polyploidy; Transcriptome; lcsh:TP248.13-248.65; Genetics; Cluster Analysis; Cloud computing; Triticum; Wheat transcriptome; Wheat genes; Contig; food and beverages; Sequence Analysis, DNA; Genome project; lcsh:Genetics; DNA microarray; Algorithms; Genome, Plant; Research Article; Biotechnology |
Source: | BMC Genomics, Vol 13, Iss 1, p 492 (2012) |
ISSN: | 1471-2164 |
DOI: | 10.1186/1471-2164-13-492 |
Description: | Background: Bread wheat is one of the world’s most important food crops and considerable efforts have been made to develop genomic resources for this species. This includes an ongoing project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related ‘homoeologous’ copies for each chromosome. This multi-national effort avoids the complications polyploidy entails for correct assembly of the genome by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternate approach, a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome. Results: After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome. Conclusions: This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms. |
Database: | OpenAIRE |
External link: |