Analysis of Naïve Bayes Algorithm for Email Spam Filtering across Multiple Datasets

Author: Fitriah, Nurul; Wahid, Norfaradilla; Kasim, Shahreen; Hafit, Hanayanti
Source: IOP Conference Series: Materials Science and Engineering; August 2017, Vol. 226 Issue: 1 p012091-012091, 1p
Abstract: E-mail spam continues to be a problem on the Internet. Spam e-mail may contain many copies of the same message, commercial advertisements or other irrelevant posts such as pornographic content. Previous research has used various filtering techniques to detect such e-mails, including Random Forest, Naive Bayesian, Support Vector Machine (SVM) and Neural Network classifiers. In this research, we test the Naive Bayes algorithm for e-mail spam filtering on two datasets, Spam Data and SPAMBASE [8], and evaluate its performance. Performance on each dataset is evaluated in terms of accuracy, recall, precision and F-measure. Our research uses the WEKA tool to evaluate the Naive Bayes algorithm for e-mail spam filtering on both datasets. The results show that the type of e-mail and the number of instances in the dataset influence the performance of Naive Bayes.
Database: Supplemental Index
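
Note: The abstract names four evaluation measures (accuracy, recall, precision and F-measure) without defining them. As a minimal sketch, assuming the standard binary-classification definitions with "spam" as the positive class, they can be computed from confusion-matrix counts as shown below. The function name binary_metrics and the example counts are hypothetical and are not taken from the paper or from WEKA.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-measure from
    confusion-matrix counts, treating spam as the positive class.

    tp: spam correctly flagged as spam
    fp: legitimate mail wrongly flagged as spam
    fn: spam that was missed
    tn: legitimate mail correctly passed through
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is the balanced F1 score (harmonic mean of
    # precision and recall); this choice is an assumption.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only (not results from the paper).
example = binary_metrics(tp=40, fp=10, fn=20, tn=30)
```

With these illustrative counts the classifier is right on 70 of 100 messages, while its recall is lower than its precision because it misses 20 of the 60 spam messages.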