Using Gaussian Mixture Models to Detect Outliers in Seasonal Univariate Network Traffic

Authors: Aarthi M. Reddy, Johan Muedsam, Brad Ford, Austin Henslee, Melissa Lee, Ronen Kahana, Max Rao, Joshua Whitney, Matt Dugan, Meredith Ordway-West
Publication year: 2017
Subject:
Source: IEEE Symposium on Security and Privacy Workshops
DOI: 10.1109/spw.2017.9
Description: This article presents an algorithm for detecting outliers in seasonal, univariate network traffic data using Gaussian Mixture Models (GMMs). Additionally, we show that this methodology can easily be implemented in a big data scenario and delivers the required information to a security analyst efficiently. The unsupervised clustering algorithm, GMM, is modified so that every data point in a set is labelled as either an outlier or a normal data point. In this article, the algorithm is evaluated only on time series data obtained from network traffic; however, it can easily be modified for use with other types of seasonal, univariate big data sets. Outlier detection in network traffic data proceeds in two stages. First, GMMs are built for the training data in each time bin of the seasonal time series, and outliers (anomalies) are detected and removed from this training set by examining the probability associated with each data point. Second, GMMs are rebuilt from the historical (training) data with the outliers removed, and the recomputed GMMs are used to detect outliers in test data. Results are compared with traditional outlier detection methods, which usually treat all data in a set as coming from a single probability density function.
Database: OpenAIRE
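
The two-stage procedure summarized in the description can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the record does not say how the per-point probability is computed or thresholded, so the tail-probability rule, the cut-off `alpha`, the component count `k`, the chunk-based initialisation, the bin key "Mon 09:00", and all function names are assumptions made for this example. A small pure-Python EM routine stands in for whatever GMM implementation the paper used.

```python
import math

def fit_gmm_1d(values, k=2, iters=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with plain EM (needs len(values) >= k).

    Returns a list of (weight, mean, variance) tuples, one per component.
    """
    data = sorted(values)
    n = len(data)
    # Assumed initialisation: one component per contiguous chunk of sorted data.
    comps = []
    for i in range(k):
        chunk = data[i * n // k:(i + 1) * n // k]
        mean = sum(chunk) / len(chunk)
        var = max(sum((x - mean) ** 2 for x in chunk) / len(chunk), min_var)
        comps.append((len(chunk) / n, mean, var))
    for _ in range(iters):
        # E-step: responsibility of every component for every point.
        resp = []
        for x in data:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in comps]
            total = sum(dens)
            resp.append([d / total if total > 0 else 1.0 / k for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        new_comps = []
        for j, old in enumerate(comps):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:  # component lost all its points; keep it unchanged
                new_comps.append(old)
                continue
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            new_comps.append((nj / n, mean, max(var, min_var)))
        comps = new_comps
    return comps

def tail_probability(x, comps):
    """Largest two-sided Gaussian tail probability of x over all components."""
    return max(math.erfc(abs(x - m) / math.sqrt(2 * v)) for _, m, v in comps)

def is_outlier(x, comps, alpha=1e-3):
    """Assumed rule: x is an outlier if it is improbable under every component."""
    return tail_probability(x, comps) < alpha

def train(history, k=2, alpha=1e-3):
    """history maps a time bin to the list of values observed in that bin."""
    models, removed = {}, {}
    for bin_key, values in history.items():
        first_pass = fit_gmm_1d(values, k)                      # stage 1: fit
        flags = [is_outlier(x, first_pass, alpha) for x in values]
        removed[bin_key] = [x for x, f in zip(values, flags) if f]
        clean = [x for x, f in zip(values, flags) if not f]
        models[bin_key] = fit_gmm_1d(clean, k)                  # stage 2: refit
    return models, removed

# Synthetic bin with two traffic levels (around 10 and 50) and one spike.
history = {"Mon 09:00": [9.0, 10.0, 11.0, 10.0] * 5
                        + [49.0, 50.0, 51.0, 50.0] * 5
                        + [500.0]}
models, removed = train(history)
model = models["Mon 09:00"]
test_flags = {x: is_outlier(x, model) for x in (10.3, 49.0, 30.0, 500.0)}
```

On this synthetic bin, the first pass removes the spike at 500.0, and the refitted mixture flags both 500.0 and 30.0 in the test values while accepting 10.3 and 49.0. The value 30.0 lies between the two traffic levels, at the mean of a single Gaussian fitted to the same cleaned data, which is the kind of case the description has in mind when it contrasts GMMs with methods that assume one probability density function.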