Benchmarking full version of GureKDDCup, UNSW-NB15, and CIDDS-001 NIDS datasets using rolling-origin resampling
Author: | Yee Jian Chew, Nicholas Ming Ze Lee, Ying Han Pang, Shih Yin Ooi, Kok-Seng Wong |
---|---|
Publication year: | 2021 |
Subject: |
Scheme (programming language)
Information Systems and Management; Computer science; business.industry; Benchmarking; Machine learning; computer.software_genre; Computer Science Applications; Domain (software engineering); Resampling; sort; Artificial intelligence; Network intrusion detection; Baseline (configuration management); business; computer; Software; IP address; computer.programming_language |
Source: | Information Security Journal: A Global Perspective. 31:544-565 |
ISSN: | 1939-3547 1939-3555 |
Description: | A network intrusion detection system (NIDS) analyses network traffic to flag malicious traffic or suspicious activity. Several recent NIDS datasets have been published; however, the lack of baseline experimental results on the full versions of these datasets has made benchmarking difficult for researchers. Because the creators have not pre-defined a train-test split for the datasets, it is also hard for researchers to compare performance across machine learning classifiers without bias. Moreover, the literature has argued that cross-validation resampling schemes are inappropriate in the NIDS domain. Thus, rolling-origin, a standard resampling technique that is also known as a common cross-validation scheme in the forecasting domain, is employed to allocate the training and testing sets. In this paper, rigorous experiments are conducted on the full versions of three recent NIDS datasets: GureKDDCup, UNSW-NB15, and CIDDS-001. Although these may not be the latest datasets available, we selected them because they include the essential IP address fields, which are usually missing or removed because of privacy concerns. To deliver baseline empirical results, 10 well-known classifiers from Weka are used. |
Database: | OpenAIRE |
External link: |
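
The rolling-origin resampling mentioned in the description can be illustrated with a minimal sketch. It assumes an expanding training window over time-ordered records, which is one common variant; this record does not state the paper's exact window sizes or configuration, and the function name `rolling_origin_splits` and its parameters are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple


def rolling_origin_splits(
    n_samples: int,
    initial_train: int,
    horizon: int,
    step: Optional[int] = None,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for time-ordered data.

    Assumption: expanding window. Each training set holds every record
    before the origin; each test set holds the next `horizon` records.
    """
    if initial_train <= 0 or horizon <= 0:
        raise ValueError("initial_train and horizon must be positive")
    if step is None:
        step = horizon  # move the origin by one full test block
    if step <= 0:
        raise ValueError("step must be positive")

    origin = initial_train
    while origin + horizon <= n_samples:
        train = list(range(0, origin))
        test = list(range(origin, origin + horizon))
        yield train, test
        origin += step  # roll the origin forward in time


# Toy example: 10 time-ordered records, 4 in the first training window.
splits = list(rolling_origin_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(f"train={train_idx} test={test_idx}")
```

Unlike shuffled k-fold cross-validation, every test record here is later in time than every training record in the same split, so a classifier is never evaluated on traffic that precedes its training data.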