Popis: |
With the rapid growth of microblogging sites such as Twitter, a wide range of spam activities is evolving at an equal pace. Detecting spam is now a major problem on social networking sites. This paper surveys spam URL detection on Twitter, covering categories of malicious behavior, detection evasion tactics, features used for detection, and detection techniques together with their limitations (if any). We further investigate the best performance achievable by machine learning classification based on various published features. To this end, we applied four classifiers to a dataset of 10713 Twitter accounts, labeled as 5358 benign and 5355 spam, using 17 robust features that are content-based and user-based. Our results show that, among the four classifiers, Random Forest with hybrid-based features achieves the highest accuracy, 96.4%, while the J48 classifier achieves a slightly lower accuracy of 94.5%. |