Infinite Scaled Dirichlet Mixture Models for Spam Filtering via Bayesian and Variational Bayes Learning
Author: | Khalid M. Jamil Khayyat, Sami Bourouis, Hassen Sallay, Nizar Bouguila, Fahd M. Aldosari |
---|---|
Year of publication: | 2018 |
Subject: |
Computer science
Generalization; Bayesian probability; Markov chain Monte Carlo; Engineering and technology; Nonparametric Bayesian inference; Mixture model; Dirichlet distribution; Data modeling; Bayes' theorem; ComputingMethodologies_PATTERNRECOGNITION; Information systems; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Algorithm |
Source: | iThings/GreenCom/CPSCom/SmartData |
DOI: | 10.1109/cybermatics_2018.2018.00306 |
Description: | Spam filtering has been the topic of extensive research in the past, and many machine learning approaches have been proposed. In this paper we propose an approach based on nonparametric Bayesian inference via an infinite scaled Dirichlet mixture model. The scaled Dirichlet can be viewed as a flexible generalization of the well-known Dirichlet distribution. Our filtering framework uses both Markov chain Monte Carlo techniques and a variational Bayes approach for learning the resulting model. Unlike the majority of previous approaches, which have considered only the textual content of emails, our approach takes into account the visual content (i.e., images), which is largely ignored despite being widely used by spammers. Extensive simulations and experiments have been conducted to demonstrate the merits of our framework. |
Database: | OpenAIRE |