An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations.

Authors: Clavijo BJ; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Venturini L; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Schudoma C; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Accinelli GG; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Kaithakottil G; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Wright J; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Borrill P; John Innes Centre, Norwich, NR4 7UH, United Kingdom., Kettleborough G; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Heavens D; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Chapman H; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Lipscombe J; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Barker T; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Lu FH; John Innes Centre, Norwich, NR4 7UH, United Kingdom., McKenzie N; John Innes Centre, Norwich, NR4 7UH, United Kingdom., Raats D; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Ramirez-Gonzalez RH; Earlham Institute, Norwich, NR4 7UZ, United Kingdom.; John Innes Centre, Norwich, NR4 7UH, United Kingdom., Coince A; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Peel N; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Percival-Alwyn L; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Duncan O; ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley Western Australia 6009, Australia., Trösch J; ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley Western Australia 6009, Australia., Yu G; John Innes Centre, Norwich, NR4 7UH, United Kingdom., Bolser DM; EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom., Namaati G; EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom., Kerhornou A; EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom., Spannagl M; Plant Genome and Systems Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany., Gundlach H; Plant Genome and Systems Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany., Haberer G; Plant Genome and Systems Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany., Davey RP; Earlham Institute, Norwich, NR4 7UZ, United Kingdom.; University of East Anglia, Norwich, NR4 7TJ, United Kingdom., Fosker C; Earlham Institute, Norwich, NR4 7UZ, United Kingdom., Palma FD; Earlham Institute, Norwich, NR4 7UZ, United Kingdom.; University of East Anglia, Norwich, NR4 7TJ, United Kingdom., Phillips AL; Rothamsted Research, Harpenden, AL5 2JQ, United Kingdom., Millar AH; ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley Western Australia 6009, Australia., Kersey PJ; EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom., Uauy C; John Innes Centre, Norwich, NR4 7UH, United Kingdom., Krasileva KV; Earlham Institute, Norwich, NR4 7UZ, United Kingdom.; University of East Anglia, Norwich, NR4 7TJ, United Kingdom.; The Sainsbury Laboratory, Norwich, NR4 7UH, United Kingdom., Swarbreck D; Earlham Institute, Norwich, NR4 7UZ, United Kingdom.; University of East Anglia, Norwich, NR4 7TJ, United Kingdom., Bevan MW; John Innes Centre, Norwich, NR4 7UH, United Kingdom., Clark MD; Earlham Institute, Norwich, NR4 7UZ, United Kingdom.; University of East Anglia, Norwich, NR4 7TJ, United Kingdom.
Language: English
Source: Genome Research [Genome Res] 2017 May; Vol. 27 (5), pp. 885-896.
DOI: 10.1101/gr.217117.116
Abstract: Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
(© 2017 Clavijo et al.; Published by Cold Spring Harbor Laboratory Press.)
Database: MEDLINE