Langevin-gradient parallel tempering for Bayesian neural learning

Authors:	Konark Jain, Rohitash Chandra, Ratneel Deo, Sally Cripps
Year of publication:	2019
Subject:	Industrial biotechnology; Computer science; Cognitive Neuroscience; Posterior probability; Bayesian probability; Engineering and technology; Bayesian inference; Machine learning; Industrial engineering & automation; Artificial Intelligence; Electrical engineering, electronic engineering, information engineering; Point estimation; Time series; Uncertainty quantification; Artificial neural network; Sampling (statistics); Markov chain Monte Carlo; Computer Science Applications; Artificial intelligence & image processing; Parallel tempering
Source:	Neurocomputing. 359:315-326
ISSN:	0925-2312
Description:	Bayesian inference provides a rigorous approach to neural learning, with knowledge represented by a posterior distribution that quantifies uncertainty. Markov Chain Monte Carlo (MCMC) methods typically implement Bayesian inference by sampling from the posterior distribution. This provides not only point estimates of the weights, but also the ability to propagate and quantify uncertainty in decision making. However, these techniques face challenges in convergence and scalability, particularly in settings with large datasets and large neural network architectures. This paper addresses these challenges in two ways. First, a parallel tempering MCMC sampling method is used to explore multiple modes of the posterior distribution and is implemented on a multi-core computing architecture. Second, we make the within-chain sampling scheme more efficient by using Langevin gradient information to create Metropolis–Hastings proposal distributions. We demonstrate the techniques on time series prediction and pattern classification applications. The results show that the method not only reduces computational time, but also provides better decision-making capabilities than related methods.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::d1dad58b9d90c401613111cae4233ef5 https://doi.org/10.1016/j.neucom.2019.05.082
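
The two ideas named in the description, gradient-informed Metropolis–Hastings proposals within each chain and state swaps between tempered chains, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy linear model in place of a neural network, tempers the whole posterior rather than only the likelihood, runs the chains sequentially in one process instead of across cores, and uses illustrative names and settings (`log_post`, `langevin_step`, `parallel_tempering`, `step`, `sigma`, the temperature ladder) that do not come from the paper.

```python
import numpy as np

NOISE_SD, PRIOR_SD = 0.1, 5.0  # assumed likelihood and prior scales


def log_post(theta, x, y):
    """Unnormalised log posterior of the toy model y = w*x + b."""
    resid = y - (theta[0] * x + theta[1])
    return (-0.5 * np.sum(resid ** 2) / NOISE_SD ** 2
            - 0.5 * np.sum(theta ** 2) / PRIOR_SD ** 2)


def grad_log_post(theta, x, y):
    """Analytic gradient of log_post with respect to (w, b)."""
    resid = y - (theta[0] * x + theta[1])
    g_lik = np.array([np.sum(resid * x), np.sum(resid)]) / NOISE_SD ** 2
    return g_lik - theta / PRIOR_SD ** 2


def langevin_step(theta, temp, x, y, rng, step=1e-4, sigma=0.02):
    """One Metropolis-Hastings update whose proposal is centred on a
    gradient step of the tempered log posterior (log_post / temp)."""
    def mean(t):
        return t + step * grad_log_post(t, x, y) / temp

    def log_q(to, frm):
        # Log density (up to a constant) of proposing `to` from `frm`.
        return -0.5 * np.sum((to - mean(frm)) ** 2) / sigma ** 2

    prop = mean(theta) + sigma * rng.standard_normal(theta.shape)
    # The proposal is asymmetric, so the ratio includes both directions.
    log_alpha = ((log_post(prop, x, y) - log_post(theta, x, y)) / temp
                 + log_q(theta, prop) - log_q(prop, theta))
    return prop if np.log(rng.uniform()) < log_alpha else theta


def parallel_tempering(x, y, temps=(1.0, 2.0, 4.0, 8.0),
                       n_iter=3000, swap_every=10, seed=0):
    """Run one chain per temperature; return samples of the T=1 chain."""
    rng = np.random.default_rng(seed)
    chains = [rng.standard_normal(2) for _ in temps]
    samples = np.empty((n_iter, 2))
    for it in range(n_iter):
        chains = [langevin_step(th, t, x, y, rng)
                  for th, t in zip(chains, temps)]
        if (it + 1) % swap_every == 0:
            # Propose swapping the states of neighbouring temperatures.
            for i in range(len(temps) - 1):
                log_swap = ((1.0 / temps[i] - 1.0 / temps[i + 1])
                            * (log_post(chains[i + 1], x, y)
                               - log_post(chains[i], x, y)))
                if np.log(rng.uniform()) < log_swap:
                    chains[i], chains[i + 1] = chains[i + 1], chains[i]
        samples[it] = chains[0]
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.uniform(-1.0, 1.0, 50)
    y = 2.0 * x - 1.0 + NOISE_SD * rng.standard_normal(50)
    draws = parallel_tempering(x, y)
    print("posterior mean (w, b):", draws[1500:].mean(axis=0))
```

Only the chain at temperature 1 targets the untempered posterior, so only its draws are kept. The hotter chains see a flattened posterior and exist to pass well-separated states down the ladder through swaps.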