Do Word Clues Suffice in Detecting Spam and Phishing?

Authors: Rowe, Neil C.; Barnes, David S.; McVicker, Michael; Egan, Melissa; David, Duane T.; Guiterrez, Louis; Martell, Craig H.
Year of publication: 2007
Subject:
Description: This paper appeared in the Proceedings of the 8th IEEE Workshop on Information Assurance, West Point, NY, June 2007. Some commercial antispam and anti-phishing products block email from “blacklisted” sites that they claim send spam and phishing email, while allowing email that claims to come from “whitelisted” sites they claim are known not to send it. This approach tends to discriminate unfairly against smaller and less-known sites, and would seem to be anti-competitive. An open question is whether other clues to spam and phishing would suffice to identify them. We report on experiments we conducted to compare different clues for automated detection tools. Results show that word clues were by far the best clues for spam and phishing, although slightly better performance could be obtained by supplementing word clues with a few others, such as the time of day the email was sent and inconsistencies in headers. We also compared different approaches to combining clues to spam, such as Bayesian reasoning, case-based reasoning, and neural networks; Bayesian reasoning performed best. Our conclusion is that Bayesian reasoning on word clues is sufficient for antispam software and that blacklists and whitelists are unnecessary. This work was supported in part by the National Science Foundation under the Cyber Trust Program. Approved for public release; distribution is unlimited.
Database: OpenAIRE
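
Illustration: The abstract concludes that Bayesian reasoning on word clues suffices for spam detection. The sketch below shows one common way such reasoning can be set up, a naive Bayes classifier over word presence with add-one smoothing. It is not the authors' implementation; the record does not describe their formulas, features, or data. The function names (train, spam_log_odds) and the four toy messages are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Toy training data, invented for this sketch (not from the paper).
TRAINING = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting agenda for monday", False),
    ("project report attached", False),
]


def train(messages):
    """Count, per class, how many messages contain each word."""
    word_counts = {True: Counter(), False: Counter()}
    msg_counts = {True: 0, False: 0}
    for text, is_spam in messages:
        msg_counts[is_spam] += 1
        # Use a set so each word counts once per message (presence, not frequency).
        word_counts[is_spam].update(set(text.lower().split()))
    return word_counts, msg_counts


def spam_log_odds(text, word_counts, msg_counts):
    """Return log odds that `text` is spam; positive values favor spam."""
    n_spam, n_ham = msg_counts[True], msg_counts[False]
    # Prior odds from class frequencies, smoothed to avoid division by zero.
    log_odds = math.log((n_spam + 1) / (n_ham + 1))
    for word in set(text.lower().split()):
        # Add-one smoothing keeps unseen words from producing log(0).
        p_word_given_spam = (word_counts[True][word] + 1) / (n_spam + 2)
        p_word_given_ham = (word_counts[False][word] + 1) / (n_ham + 2)
        # Naive Bayes assumption: each word contributes independently.
        log_odds += math.log(p_word_given_spam / p_word_given_ham)
    return log_odds


if __name__ == "__main__":
    wc, mc = train(TRAINING)
    for msg in ("free money", "project meeting"):
        label = "spam" if spam_log_odds(msg, wc, mc) > 0 else "not spam"
        print(msg, "->", label)
```

Working in log odds avoids numeric underflow when many words are multiplied together, and the decision threshold of zero is an assumption that could be shifted to trade false positives against missed spam.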