InferIP
Author: | Joobin Gharibshah, Konstantinos Pelechrinis, Andre Castro, Tai Ching Li, Maria Vanrell, Evangelos E. Papalexakis, Michalis Faloutsos |
---|---|
Year of publication: | 2017 |
Subject: |
Focus (computing); Data collection; Computer science; Novelty; 02 engineering and technology; Blacklist; World Wide Web; Identification (information); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); Key (cryptography); 020201 artificial intelligence & image processing; Hacker |
Source: | ASONAM |
DOI: | 10.1145/3110025.3110055 |
Description: | How much useful information can we extract from security forums? Many security initiatives and commercial entities are harnessing readily available public information, but they seem to focus on structured sources. Our goal here is to extract information from hacker forums, where information is provided in ad hoc and unstructured ways. We focus on the problem of identifying malicious IP addresses when they are reported in the forums. We develop a method to automate the identification of malicious IPs, with the design goal of being independent of external sources. A key novelty is that we use a matrix decomposition method to extract latent features from the behavioral information of the users, which we combine with textual information from the related posts. As a key design feature, our technique can be applied to forums in different languages, since it relies on a simple NLP solution in combination with behavioral features. In particular, our solution needs only a small number of keywords in the new language plus the user's behavior as captured by specific features. We also develop a tool to automate data collection from security forums. We collect approximately 600K posts from 3 different forums. Our method exhibits high classification accuracy, and the precision of identifying malicious IPs in posts is greater than 88% on all three sites. Furthermore, by applying our method, we find up to 3 times more potentially malicious IPs than the reference blacklist VirusTotal. As cyber-wars become more intense, early access to useful information becomes more imperative for removing the hackers' first-move advantage, and our work is a solid step in this direction. |
Database: | OpenAIRE |
External link: |
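
The description mentions spotting IP addresses reported in forum posts and relying on only a small number of keywords per language. The sketch below illustrates just that text-side step under stated assumptions; it is not the authors' implementation. The keyword list, the function names `extract_ips` and `keyword_features`, and the choice to skip private addresses are all hypothetical, and the matrix decomposition of user behavior is not shown.

```python
import ipaddress
import re

# Hypothetical keywords: the record does not list the ones used in the paper.
KEYWORDS = ("attack", "malicious", "infected", "botnet")

# Candidate dotted quads that are not part of a longer dotted number.
# Range checking of each octet is left to the ipaddress module.
_CANDIDATE = re.compile(
    r"(?<!\d)(?<!\d\.)(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)"
)


def extract_ips(text):
    """Return valid, non-private IPv4 addresses mentioned in a post."""
    found = []
    for candidate in _CANDIDATE.findall(text):
        try:
            addr = ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        if addr.is_private:
            continue  # assumption: home-network addresses are not of interest
        found.append(str(addr))
    return found


def keyword_features(text, keywords=KEYWORDS):
    """Count how often each keyword occurs as a whole word in a post."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tokens.count(keyword) for keyword in keywords]


post = (
    "This host 8.8.8.8 is part of a botnet attack. "
    "My router is 192.168.1.1 and 999.1.1.1 is junk."
)
ips = extract_ips(post)
features = keyword_features(post)
```

In a pipeline of the kind the description outlines, a keyword-count vector like `features` would be one input to a classifier alongside the behavioral features of the posting user; supporting a new forum language would then mainly mean replacing `KEYWORDS`.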