Sensitivity to prior specification in Bayesian genome-based prediction models
Authors: | Hans-Jürgen Auinger, Chris-Carolin Schön, Valentin Wimmer, Volker Schmid, Daniel Gianola, Christina Lehermeier, Theresa Albrecht |
---|---|
Contributors: | Plant Breeding, Technische Universität München |
Year of publication: | 2012 |
Subject: | Genetic Markers; Statistics and Probability; Quantitative Trait Loci; Bayesian probability; Breeding; Biology; Machine learning; Bayesian inference; Polymorphism, Single Nucleotide; Sensitivity and Specificity; Zea mays; Linkage Disequilibrium; Bayes' theorem; Linear regression; Prior probability; Statistics; Genetics; Computer Simulation; Hellinger distance; Molecular Biology; Genetic Association Studies; Hyperparameter; Models, Genetic; Linear model; Bayes Theorem; Computational Mathematics; Phenotype; Linear Models; Artificial intelligence; Algorithms; Genome, Plant |
Description: | Different statistical models have been proposed for maximizing prediction accuracy in genome-based prediction of breeding values in plant and animal breeding. However, little is known about the sensitivity of these models to prior and hyperparameter specification, because comparisons of prediction performance are mainly based on a single set of hyperparameters. In this study, we focused on Bayesian prediction methods using a standard linear regression model with marker covariates coding additive effects at a large number of marker loci. By comparing different hyperparameter settings, we investigated the sensitivity of four methods frequently used in genome-based prediction (Bayesian Ridge, Bayesian Lasso, BayesA and BayesB) to the specification of the prior distribution of marker effects. We used datasets simulated according to a typical maize breeding program, differing in the number of markers and the number of simulated quantitative trait loci affecting the trait. Furthermore, we used an experimental maize dataset comprising 698 doubled haploid lines, each genotyped with 56,110 single nucleotide polymorphism markers and phenotyped as testcrosses for two quantitative traits, grain dry matter yield and grain dry matter content. The predictive ability of the different models was assessed by five-fold cross-validation. The extent of Bayesian learning was quantified by calculating the Hellinger distance between the prior and posterior densities of marker effects. Our results indicate that similar predictive abilities can be achieved with all methods, but hyperparameter settings had a stronger effect on prediction performance with BayesA and BayesB than with the other two methods. Prediction performance of BayesA and BayesB suffered substantially from a non-optimal choice of hyperparameters. |
Database: | OpenAIRE |
External link: |
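
Note on the Hellinger distance mentioned in the description: the abstract does not say how the distance between prior and posterior densities was computed, so the following is only a minimal illustrative sketch, not the authors' procedure. It assumes both densities are univariate normals with hypothetical parameters, evaluates H(p, q) = sqrt(1 - ∫ sqrt(p(x) q(x)) dx) by a trapezoid rule, and compares the result with the known closed form for two normal densities. All function names are made up for this example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def hellinger_numeric(pdf_p, pdf_q, lo, hi, n=20001):
    """Hellinger distance via trapezoid integration of sqrt(p * q) on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.sqrt(pdf_p(x) * pdf_q(x))
    overlap = min(1.0, total * step)  # Bhattacharyya coefficient, clipped at 1
    return math.sqrt(1.0 - overlap)


def hellinger_normal(m1, s1, m2, s2):
    """Closed-form Hellinger distance between N(m1, s1^2) and N(m2, s2^2)."""
    var_sum = s1 * s1 + s2 * s2
    overlap = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(
        -((m1 - m2) ** 2) / (4.0 * var_sum)
    )
    return math.sqrt(max(0.0, 1.0 - overlap))


# Hypothetical prior and posterior for one marker effect (assumed values).
prior = (0.0, 1.0)      # mean, standard deviation
posterior = (0.3, 0.5)  # mean, standard deviation

h_num = hellinger_numeric(
    lambda x: normal_pdf(x, *prior),
    lambda x: normal_pdf(x, *posterior),
    lo=-10.0,
    hi=10.0,
)
h_closed = hellinger_normal(*prior, *posterior)
print(f"numeric: {h_num:.6f}  closed form: {h_closed:.6f}")
```

The distance is 0 when prior and posterior coincide (no learning from the data) and approaches 1 as the two densities stop overlapping, which is why it can serve as a measure of how far the data moved the posterior away from the prior.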