Variational Bayes for High-Dimensional Linear Regression With Sparse Priors

Authors: Kolyan Ray, Botond Szabó
Contributors: Mathematics
Publication year: 2021
Subject:
FOS: Computer and information sciences
Statistics and Probability
62G20 (Primary)
62G05
65K10 (secondary)

Statistics & Probability
Mathematics - Statistics Theory
Machine Learning (stat.ML)
Statistics Theory (math.ST)
Model selection
Bayesian inference
1603 Demography
Methodology (stat.ME)
Bayes' theorem
Statistics - Machine Learning
EMPIRICAL BAYES
MODEL SELECTION
ORACLE INEQUALITIES
SPARSITY
SPIKE-AND-SLAB PRIOR
VARIATIONAL BAYES

Prior probability
Linear regression
1403 Econometrics
FOS: Mathematics
SPIKE
Statistics::Methodology
Applied mathematics
NEEDLES
Statistics - Methodology
Selection (genetic algorithm)
Mathematics
Science & Technology
0104 Statistics
Spike-and-slab prior
Statistics::Computation
VARIABLE SELECTION
Oracle inequalities
Physical Sciences
SDG 1 - No Poverty
Compatibility (mechanics)
CONVERGENCE-RATES
INFERENCE
STRAW
Spike (software development)
Statistics
Probability and Uncertainty

Variational Bayes
Sparsity
POSTERIOR CONCENTRATION
Source: Journal of the American Statistical Association, 117(539), 1270-1281. Taylor and Francis Ltd.
Ray, K & Szabó, B 2022, 'Variational Bayes for High-Dimensional Linear Regression With Sparse Priors', Journal of the American Statistical Association, vol. 117, no. 539, pp. 1270-1281. https://doi.org/10.1080/01621459.2020.1847121
ISSN: 1537-274X
0162-1459
DOI: 10.1080/01621459.2020.1847121
Description: We study a mean-field spike-and-slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gives optimal prediction of the response vector. The empirical performance of our algorithm is studied, showing that it performs comparably to other state-of-the-art Bayesian variable selection methods. We also numerically demonstrate that the widely used coordinate-ascent variational inference (CAVI) algorithm can be highly sensitive to the parameter updating order, leading to potentially poor performance. To mitigate this, we propose a novel prioritized updating scheme that uses a data-driven updating order and performs better in simulations. The variational algorithm is implemented in the R package 'sparsevb'.
Comment: 42 pages. To appear in Journal of the American Statistical Association
Database: OpenAIRE
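
Illustration: the description above refers to coordinate-ascent variational inference (CAVI) for a mean-field spike-and-slab family and to a data-driven updating order. The Python sketch below is not the authors' algorithm and not the 'sparsevb' implementation. It is a minimal example under simplifying assumptions made here for illustration: a Gaussian slab N(0, tau2) (chosen because it gives closed-form updates), a known noise variance sigma2, a fixed prior inclusion probability w, and a hypothetical ordering rule that visits coordinates by decreasing |x_j' y|. All function and parameter names are invented for this example.

```python
import numpy as np


def cavi_spike_slab(X, y, order, sigma2=1.0, tau2=1.0, w=0.1, n_sweeps=50):
    """CAVI for q(beta_j) = gamma_j * N(mu_j, s2_j) + (1 - gamma_j) * delta_0.

    Assumptions (illustrative only): Gaussian slab N(0, tau2), known noise
    variance sigma2, fixed prior inclusion probability w.
    """
    n, p = X.shape
    col_norms = np.sum(X ** 2, axis=0)            # x_j' x_j for each column
    s2 = sigma2 / (col_norms + sigma2 / tau2)      # slab variances, fixed across sweeps
    mu = np.zeros(p)                               # slab means
    gamma = np.full(p, w)                          # inclusion probabilities
    fitted = X @ (gamma * mu)                      # X times the variational mean of beta
    prior_logit = np.log(w / (1.0 - w))
    for _ in range(n_sweeps):
        for j in order:
            # Remove coordinate j's contribution to form the partial residual.
            fitted -= X[:, j] * (gamma[j] * mu[j])
            mu[j] = s2[j] / sigma2 * (X[:, j] @ (y - fitted))
            log_odds = (prior_logit
                        + 0.5 * np.log(s2[j] / tau2)
                        + mu[j] ** 2 / (2.0 * s2[j]))
            # Clip before the logistic transform to avoid overflow in exp.
            gamma[j] = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -30.0, 30.0)))
            fitted += X[:, j] * (gamma[j] * mu[j])
    return mu, gamma


def prioritized_order(X, y):
    """Hypothetical data-driven order: coordinates with largest |x_j' y| first."""
    return np.argsort(-np.abs(X.T @ y))


# Small simulated example: 3 active coefficients out of p = 200, with n = 100.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 3.0
y = X @ beta_true + rng.standard_normal(n)

mu, gamma = cavi_spike_slab(X, y, prioritized_order(X, y))
print("coordinates with inclusion probability > 0.5:", np.flatnonzero(gamma > 0.5))
```

Passing a different `order` (for example `np.arange(p)` or a random permutation) to the same function is one way to explore the order sensitivity that the description mentions, since each coordinate update depends on the current values of all the others.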