DarkVec
Authors: | Idilio Drago, Dario Rossi, Marco Mellia, Luca Vassio, Luca Gioacchini, Zied Ben Houidi |
---|---|
Publication year: | 2021 |
Subject: |
Networks; Security and privacy; Network services; Network security; Network management; Network monitoring; Word embedding; Word2vec; Darknet; Botnet; Source code; Computer science; Computer network; Network packet; Host (network); Word (computer architecture) |
Source: | Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies. |
DOI: | 10.1145/3485983.3494863 |
Description: | Darknets are passive probes listening to traffic reaching IP addresses that host no services. Traffic reaching them is unsolicited by nature and often induced by scanners, malicious senders and misconfigured hosts. Its peculiar nature makes it a valuable source of information about malicious activities. However, the massive number of packets and sources that reach darknets makes it hard to extract meaningful insights. In particular, multiple senders contact the darknet while performing similar and coordinated tasks, which are often commanded by common controllers (botnets, crawlers, etc.). How to automatically identify and group senders that share similar behaviors remains an open problem. We introduce DarkVec, a methodology to identify clusters of senders (i.e., IP addresses) engaged in similar activities on darknets. DarkVec leverages word embedding techniques (e.g., Word2Vec) to capture the co-occurrence patterns of sources hitting the darknets. We extensively test DarkVec and explore its design space in a case study using one month of darknet data. We show that, with a proper definition of service, the generated embeddings can easily be used to (i) associate unknown senders' IP addresses with the correct known labels (more than 96% accuracy), and (ii) identify new attack and scan groups of previously unknown senders. We contribute the DarkVec source code and datasets to the community, also to stimulate the use of word embeddings to automatically learn patterns in generic traffic traces. |
Database: | OpenAIRE |
External link: |
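
The description says DarkVec groups senders by the co-occurrence patterns of sources hitting the darknet. The following is only a rough sketch of that idea, not the authors' implementation: it swaps Word2Vec for plain co-occurrence counts compared with cosine similarity, assumes that a "service" is simply a destination port, and uses a made-up trace with addresses from documentation ranges. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy trace: (timestamp, sender IP, destination port).
# Addresses come from documentation ranges; this is not real darknet data.
PACKETS = [
    (1, "192.0.2.1", 23), (2, "192.0.2.2", 23), (3, "192.0.2.3", 23),
    (4, "192.0.2.1", 23), (5, "192.0.2.2", 23), (6, "192.0.2.3", 23),
    (7, "198.51.100.1", 445), (8, "198.51.100.2", 445), (9, "198.51.100.3", 445),
    (10, "198.51.100.1", 445), (11, "198.51.100.2", 445), (12, "198.51.100.3", 445),
]


def build_sequences(packets):
    """Build one time-ordered sequence of sender IPs per service.

    Assumption: a service is identified by the destination port alone.
    """
    sequences = defaultdict(list)
    for _, sender, port in sorted(packets):  # tuples sort by timestamp first
        sequences[port].append(sender)
    return sequences


def cooccurrence_vectors(sequences, window=2):
    """Count, for each sender, the other senders seen within `window` positions."""
    vectors = defaultdict(Counter)
    for senders in sequences.values():
        for i, sender in enumerate(senders):
            for neighbour in senders[max(0, i - window): i + window + 1]:
                if neighbour != sender:
                    vectors[sender][neighbour] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def nearest(vectors, sender):
    """Return the other sender whose context vector is most similar."""
    return max(
        (s for s in vectors if s != sender),
        key=lambda s: cosine(vectors[sender], vectors[s]),
    )


vectors = cooccurrence_vectors(build_sequences(PACKETS))
print(nearest(vectors, "192.0.2.1"))
```

In this toy trace, senders that probe the same port in the same time span share context and end up with similar vectors, while senders on different ports share none. A learned embedding such as Word2Vec would replace the raw counts with dense vectors, but the input sequences could be built in a similar way.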