Early classification of residential networks traffic using C5.0 machine learning algorithm

Authors: Abdesselem Kortebi, Zied Aouini, Yacine Ghamri-Doudane, Iyad Lahsen Cherif
Year of publication: 2018
Source: Wireless Days
Description: A reliable traffic identification engine is a key component for Internet Service Providers (ISPs) to tune their networks to meet customers' requirements. The continuously evolving characteristics of Internet traffic, along with traffic encryption, are challenging the reliability of classical approaches (i.e., port-based identification and pattern matching). A large body of literature aims to overcome these challenges using machine-learning-based methods. However, several gaps limit the deployment of these approaches. In this paper, we focus on providing a fine-grained, early classification approach for residential traffic that takes into account the lessons learnt from the literature. Our machine learning approach can identify services at a fine granularity from the statistical features of the very first packets. Furthermore, the methodology we developed aims to overcome commonly identified validation issues. Our dataset consists of a real residential traffic capture collected in France, provided by a major ISP and involving more than 34,000 customers. Moreover, we developed an extension for an existing open-source tool to provide the community with a reliable data processing chain. Our solution achieves very promising accuracy (98.8%) while identifying encrypted services such as Facebook, Google Services, or Skype.
Database: OpenAIRE