Genome-based prediction of maize hybrid performance across genetic groups, testers, locations, and years

Author:	Milena Ouzunova, Carsten Knaak, Hans-Peter Piepho, Chris-Carolin Schön, Theresa Albrecht, Hans-Jürgen Auinger, Valentin Wimmer, Joseph O. Ogutu
Year of publication:	2014
Subject:	Genotype; Breeding program; Population; Single-nucleotide polymorphism; Breeding; Biology; Polymorphism, Single Nucleotide; Zea mays; Set (abstract data type); Statistics; Genetics; Plant breeding; education; Selection (genetic algorithm); education.field_of_study; Models, Genetic; business.industry; General Medicine; Biotechnology; Data set; Phenotype; Doubled haploidy; Hybridization, Genetic; business; Agronomy and Crop Science; Genome, Plant
Source:	Theoretical and Applied Genetics. 127:1375-1386
ISSN:	1432-2242 0040-5752
Description:	The calibration data for genomic prediction should represent the full genetic spectrum of a breeding program. Data heterogeneity is minimized by connecting data sources through highly related test units. One of the major challenges of genome-enabled prediction in plant breeding lies in the optimum design of the population employed in model training. With highly interconnected breeding cycles staggered in time, the choice of data for model training is not straightforward. We used cross-validation and independent validation to assess the performance of genome-based prediction within and across genetic groups, testers, locations, and years. The study comprised data for 1,073 and 857 doubled haploid lines evaluated as testcrosses in 2 years. Testcrosses were phenotyped for grain dry matter yield and content and genotyped with 56,110 single nucleotide polymorphism markers. Predictive abilities strongly depended on the relatedness of the doubled haploid lines from the estimation set with those on which prediction accuracy was assessed. For scenarios with strong population heterogeneity, it was advantageous to perform predictions within a priori defined genetic groups until higher connectivity through related test units was achieved. Differences between group means had a strong effect on predictive abilities obtained with both cross-validation and independent validation. Predictive abilities across subsequent cycles of selection and years were only slightly reduced compared to predictive abilities obtained with cross-validation within the same year. We conclude that the optimum data set for model training in genome-enabled prediction should represent the full genetic and environmental spectrum of the respective breeding program. Data heterogeneity can be reduced by experimental designs that maximize the connectivity between data sources by common or highly related test units.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::131ecb3204555e72449b1b75e131d44e https://doi.org/10.1007/s00122-014-2305-z