A SERP-Mining Approach for Classification of DNS Requests

Authors: Bin Yu, Junlan Lu, Magdalini Eirinaki, Cricket Liu, Nikhil Takappa Saunshi, Aldrich Mangune
Publication year: 2019
Subject:
Source: IEEE BigData
DOI: 10.1109/bigdata47090.2019.9006108
Description: DNS request classification has received considerable attention, mostly as part of the network security process, where requests are classified as malicious or non-malicious. However, several categories of web pages, although not malicious, are “borderline” and need to be monitored. For instance, a public or private organization might want to monitor outgoing traffic to websites selling illegal substances or weapons. In this work, we treat this as a topic classification problem. We present and evaluate a machine learning framework that takes as input a domain name (based on the respective DNS request) and outputs the content category it belongs to. We evaluate several options for feature engineering and classification to find the most appropriate setup for this problem domain. We also address the problem of data collection and preprocessing. While several labelled datasets of malicious/non-malicious requests exist, no comparable labelled dataset exists for general web content categories. We therefore propose a SERP (Search Engine Response Pages)-mining approach to collect and label an appropriate dataset. Our experimental evaluation uncovers several interesting insights and forms the basis for further work in this domain.
Database: OpenAIRE
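
Illustrative sketch: the description above outlines a pipeline that maps a domain name to a content category using text mined from search engine result pages. The abstract does not state which features or classifiers the authors used, so the following is only a minimal sketch under stated assumptions: a bag-of-words multinomial naive Bayes model (a swapped-in, generic technique), a hypothetical `fetch_snippets` lookup standing in for a real search engine query, and made-up `.test` domains, snippets, and category labels.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training data (not from the paper):
# domain -> (text of SERP snippets for that domain, category label).
TRAIN = {
    "example-pharma.test": ("buy pills online pharmacy no prescription cheap", "drugs"),
    "example-meds.test": ("order pills discount pharmacy tablets shipping", "drugs"),
    "example-arms.test": ("rifles ammo handguns firearms shop", "weapons"),
    "example-guns.test": ("firearms ammo scopes holsters shop deals", "weapons"),
    "example-news.test": ("breaking news politics world headlines today", "news"),
    "example-daily.test": ("latest headlines world news sports weather", "news"),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        tokens = tokenize(doc)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify_domain(model, domain, fetch_snippets):
    """Classify a domain from the SERP text returned by fetch_snippets.

    fetch_snippets is a hypothetical callable (domain -> snippet text);
    a real system would query a search engine here.
    """
    return model.predict(fetch_snippets(domain))


# Train on the toy data above.
model = NaiveBayes().fit(
    [text for text, _ in TRAIN.values()],
    [label for _, label in TRAIN.values()],
)

# Stand-in for a live SERP lookup, using canned text for unseen domains.
FAKE_SERP = {
    "unknown-shop.test": "cheap pharmacy pills",
    "unknown-range.test": "firearms ammo",
}


def fake_fetch(domain):
    return FAKE_SERP.get(domain, "")
```

A call such as `classify_domain(model, "unknown-shop.test", fake_fetch)` returns the label whose training snippets best match the fetched text. Keeping the snippet lookup as a parameter separates data collection from classification, so either part can be replaced independently.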