Contributions to the study of bi-lingual Roman Urdu SMS spam filtering

Authors: Kashif Mehmood, Awais Majeed, Hammad Afzal, Hassan Latif
Publication year: 2015
Subject:
Source: 2015 National Software Engineering Conference (NSEC).
DOI: 10.1109/nsec.2015.7396343
Description: With the increased usage of the internet and mobile phones, the volume of spam has also increased in both areas. Spam in both areas is a growing threat and sometimes causes substantial financial loss as well as loss of data or confidentiality. Action therefore needs to be taken to stop spam on both media. This paper analyses various techniques currently used for spam filtering in the context of mobile text messages. Because the content of SMS messages is distinctive, some techniques may be effective while others may not. Some of the most widely used algorithms and techniques are discussed in this paper. Furthermore, we performed automatic spam filtering using machine learning algorithms on Roman Urdu text messages and achieved an accuracy of 92.2% on a manually curated corpus of 8449 messages. The SMS corpus has also been made available for future research.
Database: OpenAIRE