Benchmarking full version of GureKDDCup, UNSW-NB15, and CIDDS-001 NIDS datasets using rolling-origin resampling

Authors: Yee Jian Chew, Nicholas Ming Ze Lee, Ying Han Pang, Shih Yin Ooi, Kok-Seng Wong
Publication year: 2021
Source: Information Security Journal: A Global Perspective. 31:544-565
ISSN: 1939-3547, 1939-3555
Description: A network intrusion detection system (NIDS) analyses network traffic to flag malicious traffic or suspicious activity. Several NIDS datasets have been published recently; however, the lack of baseline experimental results on the full versions of these datasets has made benchmarking difficult for researchers. Because the dataset creators have not pre-defined a train-test split, researchers are further hindered from comparing the performance of machine learning classifiers without bias. Moreover, the literature has noted that cross-validation resampling schemes are inappropriate in the NIDS domain. Thus, rolling-origin, a standard resampling technique that is also a common cross-validation scheme in the forecasting domain, is employed to allocate the training and testing partitions. In this paper, rigorous experiments are conducted on the full versions of three recent NIDS datasets: GureKDDCup, UNSW-NB15, and CIDDS-001. While the chosen datasets might not be the latest available, we selected them because they include the essential IP address fields, which are usually missing or removed due to privacy concerns. To deliver the baseline empirical results, 10 well-known classifiers from Weka are utilized.
Database: OpenAIRE
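
Note: the rolling-origin resampling named in the description can be illustrated with a minimal sketch. It is not the authors' configuration: the function name `rolling_origin_splits` and the parameters `initial_train`, `horizon`, and `fixed_window` are hypothetical, and the record does not state whether the paper uses an expanding or a fixed training window. The sketch assumes the records are already sorted chronologically, so every test fold lies strictly after its training fold.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_samples: int,
    initial_train: int,
    horizon: int,
    fixed_window: bool = False,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for rolling-origin evaluation.

    Hypothetical helper for illustration only. The forecast origin starts at
    `initial_train` and advances by `horizon` after each split. With
    fixed_window=False the training set grows (expanding window); with
    fixed_window=True it keeps the most recent `initial_train` samples.
    """
    if initial_train <= 0 or horizon <= 0:
        raise ValueError("initial_train and horizon must be positive")

    origin = initial_train
    # Stop once a full test block no longer fits after the origin.
    while origin + horizon <= n_samples:
        train_start = origin - initial_train if fixed_window else 0
        train = list(range(train_start, origin))
        test = list(range(origin, origin + horizon))
        yield train, test
        origin += horizon


# Toy example: 10 time-ordered records, 4 initial training records,
# and test blocks of 2 records each.
splits = list(rolling_origin_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Unlike shuffled k-fold cross-validation, no split here trains on records that come after the records it is tested on, which is the property that makes the scheme suitable for time-ordered traffic.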