CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines

Authors: Karel Hynek, Jan Luxemburk, Jaroslav Pešek, Tomáš Čejka, Pavel Šiška
Language: English
Year of publication: 2024
Subject:
Source: Scientific Data, Vol 11, Iss 1, Pp 1-10 (2024)
Document type: article
ISSN: 2052-4463
DOI: 10.1038/s41597-024-03927-4
Description: Abstract: The modern approach to network traffic classification (TC), an important part of operating and securing networks, is to use machine learning (ML) models that can learn intricate relationships between traffic characteristics and communicating applications. A crucial prerequisite is the availability of representative datasets. However, too few datasets collected from real production networks are published. This paper therefore presents a novel dataset, CESNET-TLS-Year22, that captures the evolution of TLS traffic in an ISP network over one year. The dataset contains 180 web service labels and standard TC features, such as packet sequences. The unique year-long time span enables comprehensive evaluation of TC models and assessment of their robustness in the ever-changing environment of production networks.
Database: Directory of Open Access Journals