Langevin-gradient parallel tempering for Bayesian neural learning
Author: | Konark Jain, Rohitash Chandra, Ratneel Deo, Sally Cripps |
---|---|
Year of publication: | 2019 |
Subject: | 0209 industrial biotechnology; 02 engineering and technology; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computer science; Cognitive Neuroscience; Computer Science Applications; Artificial intelligence; Machine learning; Artificial neural network; Bayesian inference; Bayesian probability; Posterior probability; Point estimation; Uncertainty quantification; Sampling (statistics); Markov chain Monte Carlo; Parallel tempering; Time series; Statistics::Computation; ComputingMethodologies_PATTERNRECOGNITION |
Source: | Neurocomputing. 359:315-326 |
ISSN: | 0925-2312 |
Description: | Bayesian inference provides a rigorous approach for neural learning, with knowledge represented via the posterior distribution, which accounts for uncertainty quantification. Markov Chain Monte Carlo (MCMC) methods typically implement Bayesian inference by sampling from the posterior distribution. This provides not only point estimates of the weights, but also the ability to propagate and quantify uncertainty in decision making. However, these techniques face challenges in convergence and scalability, particularly in settings with large datasets and neural network architectures. This paper addresses these challenges in two ways. First, a parallel tempering MCMC sampling method is used to explore multiple modes of the posterior distribution and is implemented in a multi-core computing architecture. Second, we make the within-chain sampling scheme more efficient by using Langevin gradient information to create Metropolis–Hastings proposal distributions. We demonstrate the techniques using time series prediction and pattern classification applications. The results show that the method not only improves the computational time, but also provides better decision-making capabilities when compared to related methods. |
Database: | OpenAIRE |
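
The two ideas named in the abstract, tempered replicas that exchange states and gradient-informed Metropolis–Hastings proposals within each replica, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy linear model with Gaussian noise in place of a neural network, uses a standard MALA-style Langevin proposal, tempers only the likelihood, and runs the replicas in a serial loop rather than on multiple cores. All function names, parameters, and default values below are hypothetical and chosen for illustration only.

```python
import numpy as np

# Hypothetical toy model: y = X @ theta + Gaussian noise, Gaussian prior on theta.
def log_lik(theta, X, y, noise_var=0.25):
    resid = y - X @ theta
    return -0.5 * np.sum(resid ** 2) / noise_var

def grad_log_lik(theta, X, y, noise_var=0.25):
    return X.T @ (y - X @ theta) / noise_var

def log_prior(theta, prior_var=25.0):
    return -0.5 * np.sum(theta ** 2) / prior_var

def grad_log_prior(theta, prior_var=25.0):
    return -theta / prior_var

def tempered_log_post(theta, X, y, beta):
    # Only the likelihood is tempered: beta = 1 / temperature.
    return beta * log_lik(theta, X, y) + log_prior(theta)

def langevin_step(theta, X, y, beta, step, rng):
    """One Metropolis-Hastings step with a Langevin (gradient-shifted) proposal."""
    def drift(t):
        g = beta * grad_log_lik(t, X, y) + grad_log_prior(t)
        return t + 0.5 * step ** 2 * g

    def log_q(to, frm):
        # Log density (up to a constant) of proposing `to` from `frm`.
        return -0.5 * np.sum((to - drift(frm)) ** 2) / step ** 2

    prop = drift(theta) + step * rng.standard_normal(theta.shape)
    log_alpha = (tempered_log_post(prop, X, y, beta)
                 - tempered_log_post(theta, X, y, beta)
                 + log_q(theta, prop) - log_q(prop, theta))
    if np.log(rng.uniform()) < log_alpha:
        return prop
    return theta

def parallel_tempering(X, y, temps, n_iter=2000, step=0.03, swap_every=10, seed=0):
    """Serial parallel tempering; returns the samples of the first replica in temps."""
    rng = np.random.default_rng(seed)
    betas = 1.0 / np.asarray(temps, dtype=float)
    chains = [rng.standard_normal(X.shape[1]) for _ in betas]
    samples = []
    for it in range(n_iter):
        # Within-chain update of every replica at its own temperature.
        for k, beta in enumerate(betas):
            chains[k] = langevin_step(chains[k], X, y, beta, step, rng)
        # Periodically propose swaps between neighbouring temperatures.
        if (it + 1) % swap_every == 0:
            for k in range(len(betas) - 1):
                diff = log_lik(chains[k + 1], X, y) - log_lik(chains[k], X, y)
                if np.log(rng.uniform()) < (betas[k] - betas[k + 1]) * diff:
                    chains[k], chains[k + 1] = chains[k + 1], chains[k]
        samples.append(chains[0].copy())
    return np.array(samples)

def make_toy_data(n=100, seed=1):
    """Synthetic regression data with known weights (illustration only)."""
    rng = np.random.default_rng(seed)
    true_theta = np.array([1.5, -0.7])
    X = rng.standard_normal((n, 2))
    y = X @ true_theta + 0.5 * rng.standard_normal(n)
    return X, y, true_theta
```

A possible usage, under the same assumptions: call `X, y, true_theta = make_toy_data()`, then `samples = parallel_tempering(X, y, temps=[1.0, 2.0, 4.0, 8.0])`, discard an initial burn-in portion of `samples`, and compare the mean of the remainder with `true_theta`. With the first temperature set to 1, the returned replica targets the untempered posterior. The hotter replicas see a flattened likelihood, which is what lets a sampler of this kind move between modes in harder, multimodal problems.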