Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures
Author: | Mohamed Al Mashrgy, Sami Bourouis, Nizar Bouguila, Hassen Sallay, Fahd M. Aldosari, Faisal R. Al-Osaimi |
---|---|
Year of publication: | 2018 |
Subject: |
Computer science; Gaussian; Model selection; Bayesian probability; Statistical model; Feature selection; Markov chain Monte Carlo; Density estimation; Reversible-jump Markov chain Monte Carlo; Mixture model; Bayesian inference; Dirichlet distribution; Theoretical Computer Science; Geometry and Topology; Cluster analysis; Algorithm; Software; Parametric statistics; 02 engineering and technology; 0209 industrial biotechnology; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing |
Source: | Soft Computing. 23:5799-5813 |
ISSN: | 1433-7479 1432-7643 |
DOI: | 10.1007/s00500-018-3244-4 |
Description: | The goal of constructing models from examples has been approached from different perspectives. Statistical methods have been widely used and have proved effective in generating accurate models. Finite Gaussian mixture models have been used to describe a wide variety of random phenomena and have played a prominent role in many attempts to develop expressive statistical models in machine learning. However, their effectiveness is limited to applications where the underlying modeling assumptions (e.g., that the per-component densities are Gaussian) are reasonably satisfied. Thus, much research effort has been devoted to developing better alternatives. In this paper, we focus on constructing statistical models from positive vectors (i.e., vectors whose elements are strictly greater than zero), for which the generalized inverted Dirichlet (GID) mixture has been shown to be a flexible and powerful parametric framework. In particular, we propose a Bayesian density estimation method based upon mixtures of GIDs. Bayesian learning is attractive in several respects: it takes uncertainty into account by introducing prior information about the parameters, it allows simultaneous parameter estimation and model selection, and it helps overcome learning problems related to over- or under-fitting. Specifically, we develop a reversible jump Markov chain Monte Carlo sampler for GID mixtures that we apply to simultaneous clustering and feature selection in the context of some challenging real-world applications concerning scene classification, action recognition, and video forgery detection. |
Database: | OpenAIRE |
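
For readers unfamiliar with the model named in the abstract, the following is a minimal sketch of how a GID mixture log-density could be evaluated on a positive vector. The record itself does not state the density; the form used here is the one commonly given in the literature, with exponents gamma_d = alpha_d + beta_d - beta_{d+1} and beta_{D+1} = 0, and should be treated as an assumption. The function names `gid_logpdf` and `mixture_logpdf` are hypothetical, and the sketch does not include the reversible jump sampler or the feature selection described in the paper.

```python
import math


def gid_logpdf(x, alpha, beta):
    """Log-density of a generalized inverted Dirichlet at a positive vector x.

    Assumed form (not stated in the record):
      p(x) = prod_d  G(a_d + b_d) / (G(a_d) G(b_d))
                     * x_d^(a_d - 1) / (1 + x_1 + ... + x_d)^gamma_d
    with gamma_d = a_d + b_d - b_{d+1} and b_{D+1} = 0.
    """
    dim = len(x)
    logp = 0.0
    running_sum = 0.0  # x_1 + ... + x_d
    for d in range(dim):
        running_sum += x[d]
        beta_next = beta[d + 1] if d + 1 < dim else 0.0
        gamma_d = alpha[d] + beta[d] - beta_next
        # Log of the normalizing constant for dimension d.
        logp += (math.lgamma(alpha[d] + beta[d])
                 - math.lgamma(alpha[d]) - math.lgamma(beta[d]))
        # Log of the kernel for dimension d.
        logp += (alpha[d] - 1.0) * math.log(x[d]) - gamma_d * math.log1p(running_sum)
    return logp


def mixture_logpdf(x, weights, alphas, betas):
    """Log-density of a finite GID mixture, combined with log-sum-exp."""
    terms = [math.log(w) + gid_logpdf(x, a, b)
             for w, a, b in zip(weights, alphas, betas)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Illustrative values only; these are not parameters from the paper.
x = [0.5, 2.0, 1.2]
weights = [0.4, 0.6]
alphas = [[2.0, 3.0, 1.5], [1.2, 2.2, 4.0]]
betas = [[4.0, 2.5, 3.0], [3.5, 2.0, 2.8]]
print(mixture_logpdf(x, weights, alphas, betas))
```

Working in log space with `math.lgamma` and a log-sum-exp step keeps the evaluation stable when component densities are very small, which matters once such a function is called repeatedly inside a sampler.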