Infinite Scaled Dirichlet Mixture Models for Spam Filtering via Bayesian and Variational Bayes Learning

Authors: Khalid M. Jamil Khayyat, Sami Bourouis, Hassen Sallay, Nizar Bouguila, Fahd M. Aldosari
Year of publication: 2018
Subject:
Source: iThings/GreenCom/CPSCom/SmartData
DOI: 10.1109/cybermatics_2018.2018.00306
Description: Spam filtering has been the topic of extensive research, and many machine learning approaches have been proposed. In this paper, we propose an approach based on nonparametric Bayesian inference via an infinite scaled Dirichlet mixture model. The scaled Dirichlet can be viewed as a flexible generalization of the well-known Dirichlet distribution. Our filtering framework uses both Markov chain Monte Carlo techniques and a variational Bayes approach for learning the resulting model. Unlike the majority of previous approaches, which have considered only the textual content of emails, our approach takes into account the visual content (i.e., images), which is largely ignored despite being widely used by spammers. Extensive simulations and experiments have been conducted to demonstrate the merits of our framework.
Database: OpenAIRE