Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm

Authors: Diyari Jalal Mussa, Noor Ghazi M. Jameel
Language: English
Year of publication: 2019
Subject:
Source: Kurdistan Journal of Applied Research, Vol 4, Iss 2 (2019)
Document type: article
ISSN: 2411-7684, 2411-7706
DOI: 10.24017/science.2019.2.11
Description: In recent years, with the widespread use of mobile devices, the problem of SMS spam has increased dramatically. Continuously receiving these unwanted messages can frustrate users, and sometimes the messages are harmful, for example when they link to fake web pages designed to steal users’ confidential information. Despite the number of hazardous actions associated with spam, only a limited number of spam filtering software products are available. In this paper, the XGBoost algorithm is used to address the SMS spam detection problem. A number of structural features were collected from previous studies, and 15 structural features were extracted from Tiago’s dataset, which is the dataset most frequently used by researchers. To select the most relevant features, two wrapper feature selection algorithms were used to reduce the feature set. The accuracy and performance obtained with the features selected by sequential backward selection were better than those obtained with sequential forward selection. The nine optimal features extracted can serve as a good representation of a spam SMS message. The classification accuracy obtained by the proposed method, using the nine optimal features with the XGBoost algorithm, is 98.64% under 10-fold cross-validation.
Database: Directory of Open Access Journals
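
The abstract names two wrapper feature selection methods, sequential backward selection and sequential forward selection. The following is a minimal Python sketch of both greedy searches, not the authors' implementation. The `score` callable stands in for whatever evaluator the wrapper uses (in the paper's setting this would presumably be the cross-validated accuracy of an XGBoost classifier on the chosen features). The feature names and the `toy_score` function are hypothetical and exist only to make the example runnable.

```python
from typing import Callable, FrozenSet, Iterable

# A scorer maps a feature subset to a quality value (higher is better).
Scorer = Callable[[FrozenSet[str]], float]


def sequential_backward_selection(
    features: Iterable[str], score: Scorer, k: int
) -> FrozenSet[str]:
    """Start from all features and greedily drop one at a time until k remain."""
    selected = frozenset(features)
    while len(selected) > k:
        # Evaluate every subset obtained by removing a single feature
        # and keep the one the scorer rates highest.
        candidates = [selected - {f} for f in sorted(selected)]
        selected = max(candidates, key=score)
    return selected


def sequential_forward_selection(
    features: Iterable[str], score: Scorer, k: int
) -> FrozenSet[str]:
    """Start from no features and greedily add one at a time until k are chosen."""
    selected: FrozenSet[str] = frozenset()
    remaining = set(features)
    while len(selected) < k and remaining:
        # Add the single feature whose inclusion yields the best score.
        best = max(sorted(remaining), key=lambda f: score(selected | {f}))
        selected = selected | {best}
        remaining.remove(best)
    return selected


# Hypothetical feature names and weights, used only for illustration.
toy_weights = {
    "has_url": 0.30,
    "has_digits": 0.25,
    "length": 0.20,
    "has_currency": 0.15,
    "noise_a": -0.05,
    "noise_b": -0.02,
}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in evaluator: an assumption, not a real classifier's accuracy."""
    return sum(toy_weights[f] for f in subset)


sbs_result = sequential_backward_selection(toy_weights, toy_score, k=4)
sfs_result = sequential_forward_selection(toy_weights, toy_score, k=4)
print(sorted(sbs_result))
print(sorted(sfs_result))
```

With a real evaluator the two searches can settle on different subsets, which is consistent with the abstract's report that backward selection gave better results than forward selection on this task.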