Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
Author: | Zhang, Ruqi; Li, Chunyuan; Zhang, Jianyi; Chen, Changyou; Wilson, Andrew Gordon |
---|---|
Year of publication: | 2019 |
Subject: |
Methodology (stat.ME);
FOS: Computer and information sciences; Computer Science - Machine Learning; ComputingMethodologies_PATTERNRECOGNITION; Artificial Intelligence (cs.AI); Computer Science - Artificial Intelligence; Statistics - Machine Learning; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; Machine Learning (stat.ML); Statistics - Methodology; Statistics::Computation; Machine Learning (cs.LG) |
DOI: | 10.48550/arxiv.1902.03932 |
Description: | The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We also prove non-asymptotic convergence of our proposed algorithm. Moreover, we provide extensive experimental results, including ImageNet, to demonstrate the scalability and effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks. Comment: Published at ICLR 2020 |
Database: | OpenAIRE |
External link: |