A Chinese spam detector using text and image features

Author: 楊僑友
Publication year: 2007
Document type: thesis
Description: 95
With the rapid growth of the Internet, e-mail spam has become an increasingly important problem. In this thesis, we propose methods for detecting spam mail by analyzing message content, using several features for e-mail classification. First, we use keywords as the features of an e-mail's text content. Combining a new keyword selection method with nearest neighbor classifier training, we propose a nearest neighbor classifier trained with an evolutionary algorithm. Owing to keyword selection, this approach not only improves the accuracy of text-based e-mail classification but also reduces the number of features and references. For a newer type of spam, image spam, we propose a new image analysis method. Using certain distinctive image features, the spam image classifier achieves an accuracy of 82%. We expect that these two approaches will improve the efficiency of spam mail detection.
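The abstract does not give the details of the keyword selection or of the evolutionary training, so the following is only a generic sketch of the idea, not the thesis's method. It assumes e-mails are encoded as binary keyword-presence vectors, uses a 1-nearest-neighbor classifier with Hamming distance, and evolves a keyword mask by bit-flip mutation with elitist selection. All names (`loo_accuracy`, `fitness`, `evolve_mask`), the penalty weight, and the toy data are hypothetical.

```python
import random

def hamming(a, b, mask):
    # Count disagreements only on the keywords selected by the mask.
    return sum(1 for i, m in enumerate(mask) if m and a[i] != b[i])

def loo_accuracy(X, y, mask):
    # Leave-one-out accuracy of a 1-nearest-neighbor classifier
    # restricted to the selected keywords.
    correct = 0
    for i in range(len(X)):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: hamming(X[i], X[j], mask))
        correct += y[nearest] == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    # Reward accuracy, lightly penalize the number of keywords kept
    # (the penalty weight is an assumed value).
    return loo_accuracy(X, y, mask) - penalty * sum(mask)

def evolve_mask(X, y, generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half, then refill with one-bit mutants of it.
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        for p in parents:
            child = p[:]
            k = rng.randrange(n)
            child[k] = 1 - child[k]  # flip one keyword in or out
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

# Toy data: columns are hypothetical keywords, rows are e-mails (1 = spam).
X = [[1, 1, 0, 1], [1, 0, 0, 0], [1, 1, 1, 1],
     [0, 0, 1, 1], [0, 1, 1, 0], [0, 0, 0, 1]]
y = [1, 1, 1, 0, 0, 0]
best = evolve_mask(X, y)
print(best, loo_accuracy(X, y, best))
```

The keyword-count penalty is what pushes the search toward fewer features; a real system would evaluate on held-out mail rather than on the same small set used for selection.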
Database: Networked Digital Library of Theses & Dissertations