Do Word Clues Suffice in Detecting Spam and Phishing?

Authors: L. Guiterrez, D.P. Horner, D.S. Barnes, M. Egan, Craig Martell, M. McVicker, R. Betancourt, R. Toledo, D.T. Davis, Neil C. Rowe
Year of publication: 2007
Subject:
Source: 2007 IEEE SMC Information Assurance and Security Workshop.
DOI: 10.1109/iaw.2007.381908
Description: Some commercial anti-spam and anti-phishing products block email from "blacklisted" sites that they claim send spam and phishing email, while allowing email that claims to be from "whitelisted" sites they claim are known not to send it. This approach tends to discriminate unfairly against smaller and lesser-known sites, and would seem to be anti-competitive. An open question is whether other clues to spam and phishing would suffice to identify them. We report on experiments we conducted to compare different clues for automated detection tools. Results show that word clues were by far the best clues for spam and phishing, although slightly better performance could be obtained by supplementing word clues with a few others, such as the time of day the email was sent and inconsistencies in headers. We also compared different approaches to combining clues to spam, such as Bayesian reasoning, case-based reasoning, and neural networks; Bayesian reasoning performed the best. Our conclusion is that Bayesian reasoning on word clues is sufficient for anti-spam software and that blacklists and whitelists are unnecessary.
Database: OpenAIRE
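
Illustration: the description above concludes that Bayesian reasoning on word clues is sufficient for spam detection. The record does not say how the authors combined word evidence, so the following is only a minimal sketch of one common way to do it, a naive Bayes classifier with add-one (Laplace) smoothing. The function names, the whitespace tokenizer, and the four toy messages are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy training set; not data from the paper.
TOY_MESSAGES = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for "word clues".
    return text.lower().split()

def train(messages):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_probability(text, model):
    """Posterior P(spam | words) under the naive independence assumption."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen class/word pairs above zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        log_score[label] = score
    # Normalize in log space to avoid underflow on long messages.
    peak = max(log_score.values())
    weights = {label: math.exp(s - peak) for label, s in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

model = train(TOY_MESSAGES)
```

With this toy model, `spam_probability("free money", model)` comes out above 0.5 and `spam_probability("monday meeting", model)` below it. A message made up only of unseen words falls back to the class prior. Additional clues mentioned in the description, such as time of day or header inconsistencies, could be added as extra features in the same way, though how the authors actually did so is not stated here.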