Contributions to the study of bi-lingual Roman Urdu SMS spam filtering

Author:	Kashif Mehmood, Awais Majeed, Hammad Afzal, Hassan Latif
Publication year:	2015
Subject:	business.industry; Computer science; Context (language use); SMS spam; language.human_language; Support vector machine; World Wide Web; Statistical classification; language; The Internet; Confidentiality; Mobile telephony; Urdu; business
Source:	2015 National Software Engineering Conference (NSEC).
DOI:	10.1109/nsec.2015.7396343
Description:	With the increased use of the internet and mobile phones, the volume of spam has also increased on both. Spam on these media is a growing threat and sometimes causes huge financial losses as well as loss of data or confidentiality. Therefore, action needs to be taken to stop spam on both media. This paper analyses various techniques currently used for spam filtering in the context of mobile text messages. The contents of SMS messages are unique in nature, so some techniques may be effective while others may not. Some of the most commonly used algorithms and techniques are discussed in this paper. Furthermore, we have performed automatic spam filtering using machine learning algorithms on Roman Urdu text messages and achieved an accuracy of 92.2% on a manually curated corpus of 8449 messages. The SMS corpus has also been made available for future research.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::524bb7522067081a641f64cbe88fcaf7 https://doi.org/10.1109/nsec.2015.7396343