Bayesian variable selection in linear regression models with non-normal errors
Author: | Gabriele Soffritti, Giuliano Galimberti, Saverio Ranciati |
---|---|
Contributors: | Saverio Ranciati, Giuliano Galimberti, Gabriele Soffritti, Francesca Greselin, F. Mola, M. Zenga |
Publication year: | 2018 |
Subject: |
Statistics and Probability
Gaussian mixture model · g-prior · MCMC algorithm · Median probability criterion · Variables · Computer science · Feature selection · Mixture model · Skewness · Linear regression · Statistics, Probability and Uncertainty · Bayesian linear regression · Algorithm · Selection (genetic algorithm) · 01 natural sciences · 010104 statistics & probability · 0101 mathematics |
Source: | Statistical Methods & Applications. 28:323-358 |
ISSN: | 1613-981X 1618-2510 |
DOI: | 10.1007/s10260-018-00441-x |
Description: | This paper addresses two crucial issues in multiple linear regression analysis: (i) error terms whose distribution is non-normal because of asymmetry in the response variable and/or data coming from heterogeneous populations; (ii) selection of the regressors that effectively contribute to explaining patterns in the observations and are relevant for predicting the dependent variable. A solution to the first issue can be obtained through an approach in which the distribution of the error terms is modelled using a finite mixture of Gaussian distributions. In this paper we use this approach to specify a Bayesian linear regression model with non-normal errors; furthermore, by embedding Bayesian variable selection techniques in the specification of the model, we simultaneously perform estimation and variable selection. These tasks are accomplished by sampling from the posterior distributions associated with the model. The performance of the proposed methodology is evaluated through the analysis of simulated datasets in comparison with other approaches. The results of an analysis based on a real dataset are also provided. The methods developed in this paper perform well when the distribution of the error terms is characterised by heavy tails, skewness and/or multimodality. |
Database: | OpenAIRE |
External link: |
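
The description mentions regression errors modelled by a finite mixture of Gaussians, and the subject keywords mention a median probability criterion. The following is a minimal sketch of those two ideas only, not the authors' sampler: all function names, the mixture parameters, and the toy indicator draws are hypothetical, and the selection rule assumes the usual reading of the criterion (keep a regressor when its estimated posterior inclusion probability is at least 0.5).

```python
import random


def sample_mixture_error(weights, means, sds, rng):
    """Draw one error term from a finite mixture of Gaussians.

    A component is picked with probability proportional to its weight,
    then a normal draw is taken from that component.
    """
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], sds[k])


def simulate_response(x_rows, beta, weights, means, sds, seed=0):
    """Simulate y_i = x_i' beta + e_i with mixture-distributed errors e_i."""
    rng = random.Random(seed)
    return [
        sum(b * x for b, x in zip(beta, row))
        + sample_mixture_error(weights, means, sds, rng)
        for row in x_rows
    ]


def median_probability_model(indicator_draws):
    """Summarise 0/1 inclusion indicators drawn by some MCMC sampler.

    Returns the per-regressor inclusion frequency and the indices of the
    regressors whose frequency is at least 0.5 (assumed selection rule).
    """
    n_draws = len(indicator_draws)
    n_regressors = len(indicator_draws[0])
    inclusion = [
        sum(draw[j] for draw in indicator_draws) / n_draws
        for j in range(n_regressors)
    ]
    selected = [j for j, prob in enumerate(inclusion) if prob >= 0.5]
    return inclusion, selected


if __name__ == "__main__":
    # Hypothetical two-component mixture giving a skewed error distribution.
    y = simulate_response(
        x_rows=[[1.0, 2.0], [1.0, 3.0]],
        beta=[0.5, 1.0],
        weights=[0.7, 0.3],
        means=[-1.0, 2.0],
        sds=[0.5, 1.0],
    )
    print(y)

    # Made-up indicator draws for three candidate regressors.
    draws = [[1, 0, 1], [1, 0, 0], [1, 1, 1], [1, 0, 1]]
    print(median_probability_model(draws))
```

In a real analysis the indicator draws would come from the posterior sampler rather than being typed in by hand; the summary step stays the same.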