Bayesian variable selection in linear regression models with non-normal errors

Author:	Gabriele Soffritti, Giuliano Galimberti, Saverio Ranciati
Contributors:	Saverio Ranciati, Giuliano Galimberti, Gabriele Soffritti, Francesca Greselin, F. Mola, M. Zenga
Year of publication:	2018
Subject:	Statistics and Probability; Statistics, Probability and Uncertainty; Gaussian mixture model; g-prior; MCMC algorithm; Median probability criterion; Variables; Computer science; Feature selection; Mixture model; Natural sciences; Mathematics; Skewness; Linear regression; Statistics; Bayesian linear regression; Algorithm; Selection (genetic algorithm)
Source:	Statistical Methods & Applications. 28:323-358
ISSN:	1613-981X 1618-2510
DOI:	10.1007/s10260-018-00441-x
Description:	This paper addresses two crucial issues in multiple linear regression analysis: (i) error terms whose distribution is non-normal because the response variable is asymmetric and/or the data come from heterogeneous populations; (ii) selection of the regressors that effectively contribute to explaining patterns in the observations and are relevant for predicting the dependent variable. The first issue can be addressed by modelling the distribution of the error terms as a finite mixture of Gaussian distributions. In this paper we use this approach to specify a Bayesian linear regression model with non-normal errors; furthermore, by embedding Bayesian variable selection techniques in the specification of the model, we perform estimation and variable selection simultaneously. These tasks are accomplished by sampling from the posterior distributions associated with the model. The performance of the proposed methodology is evaluated through the analysis of simulated datasets, in comparison with other approaches. The results of an analysis based on a real dataset are also provided. The methods developed in this paper are found to perform well when the distribution of the error terms is characterised by heavy tails, skewness and/or multimodality.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ac31e45258924f0754ea0534a2041603 https://doi.org/10.1007/s10260-018-00441-x
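
The description and subject terms mention two ingredients that a short sketch may make more concrete: regression errors drawn from a finite mixture of Gaussian distributions, and the median probability criterion, which is commonly stated as keeping the regressors whose posterior inclusion probability is at least 0.5. The Python sketch below is only an illustration under assumptions: the function names, the mixture parameters and the toy indicator draws are hypothetical and are not taken from the paper, and no MCMC sampler is implemented here.

```python
import numpy as np


def simulate_regression_mixture_errors(n, beta, weights, means, sds, seed=0):
    """Draw (X, y) from y = X @ beta + e, where e follows a finite Gaussian mixture.

    Hypothetical helper for illustration; not the authors' simulation design.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    X = rng.normal(size=(n, beta.size))
    # Pick a mixture component for each observation, then draw its error term.
    component = rng.choice(len(weights), size=n, p=weights)
    errors = rng.normal(loc=np.asarray(means)[component],
                        scale=np.asarray(sds)[component])
    return X, X @ beta + errors


def median_probability_selection(indicator_draws, threshold=0.5):
    """Estimate inclusion probabilities from 0/1 indicator draws and apply the threshold.

    Each row of `indicator_draws` is one posterior draw; each column is one regressor.
    """
    draws = np.asarray(indicator_draws, dtype=float)
    inclusion_prob = draws.mean(axis=0)
    selected = np.flatnonzero(inclusion_prob >= threshold)
    return inclusion_prob, selected


# Two-component mixture with overall mean zero (0.7 * -0.9 + 0.3 * 2.1 = 0), skewed errors.
X, y = simulate_regression_mixture_errors(
    n=200, beta=[1.5, 0.0, -2.0],
    weights=[0.7, 0.3], means=[-0.9, 2.1], sds=[0.5, 1.0],
)

# Toy indicator draws standing in for sampler output (assumed, not real results).
toy_draws = [[1, 0, 1], [1, 0, 1], [1, 1, 1], [1, 0, 0]]
probs, selected = median_probability_selection(toy_draws)
```

With these toy draws the estimated inclusion probabilities are 1.0, 0.25 and 0.75, so the first and third regressors would be retained. In an actual analysis the indicator draws would come from a posterior sampler for the full model.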