Single-View Performance Monitoring of On-Line Applications Running on a Cloud

Authors: Yasuhiko Kanemasa, Shuji Suzuki, Atsushi Kubota, Junichi Higuchi
Year of publication: 2017
Subject:
Source: CLOUD
DOI: 10.1109/cloud.2017.52
Description: An important concern for current IaaS cloud providers is being falsely accused by cloud users of causing the response delays of their applications running on the cloud. Since cloud computing introduces an additional virtualization layer between the guest OS and hardware resources, the relationship between the performance of an application and its resource consumption becomes obscure. Monitoring resource consumption alone (e.g., CPU, memory, and disk I/O) is often inadequate to detect the QoS degradation of a deployed application, let alone to resolve cloud users' QoS complaints. In this paper, we introduce a novel single-view performance visualization technique that lets a cloud platform monitor the response delays of all the on-line applications running on it. Our technique sniffs network packets from a network switch in the platform and analyzes their flow to estimate the response time of each application. An important merit of this approach is that it is easy to install in a cloud platform: it only requires changing the configuration of a network switch to enable port mirroring. To detect and visualize the response delay of each application in a single view, we convert the absolute response time of each application to a comparable degree of delay (the "delay ratio"), which shows how serious the response delays are relative to the application's baseline. The baseline is the service time of each application, which we estimate from the distribution of observed response times (service time + waiting time) by fitting an exponentially modified Gaussian (EMG) distribution with outlier elimination. We validate our approach through extensive experiments with a representative on-line application benchmark (RUBBoS). The results show high repeatability of the service time estimation (the standard deviation over 40 trials is 0.11). The results also show that our performance monitoring technique is able to accurately visualize response delays even when the training set of response times has a heavy-tailed distribution.
Database: OpenAIRE
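
The abstract describes estimating a per-application baseline from observed response times with an EMG fit and then expressing delays relative to that baseline. The following is a minimal sketch of that idea, not the authors' implementation. It rests on several assumptions that the record does not confirm: the fit uses the method of moments rather than whatever fitting procedure the paper uses, outliers are removed with a wide quartile fence, the Gaussian mean `mu` is taken as the service time, and the delay ratio is taken to be response time divided by baseline. All function names are hypothetical.

```python
import random
import statistics


def drop_outliers(samples, k=10.0):
    """Remove gross outliers above Q3 + k * IQR (assumed rule, not from the paper)."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    fence = q3 + k * (q3 - q1)
    return [x for x in samples if x <= fence]


def fit_emg_moments(samples):
    """Method-of-moments EMG fit; returns (mu, sigma, tau).

    Uses the EMG identities: mean = mu + tau, variance = sigma^2 + tau^2,
    skewness = 2 * tau^3 / (sigma^2 + tau^2)^(3/2).
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    skew = sum((x - mean) ** 3 for x in samples) / (n * std ** 3)
    # EMG skewness lies strictly between 0 and 2; clamp noisy estimates.
    skew = min(max(skew, 1e-9), 1.999)
    tau = std * (skew / 2.0) ** (1.0 / 3.0)
    mu = mean - tau
    sigma = max(std * std - tau * tau, 0.0) ** 0.5
    return mu, sigma, tau


def delay_ratio(response_time, baseline):
    """Assumed definition: observed response time relative to the baseline."""
    return response_time / baseline


if __name__ == "__main__":
    # Synthetic response times: Gaussian service time plus exponential waiting time.
    rng = random.Random(42)
    observed = [rng.gauss(10.0, 1.0) + rng.expovariate(0.5) for _ in range(5000)]
    observed += [500.0] * 5  # a few gross outliers
    mu, sigma, tau = fit_emg_moments(drop_outliers(observed))
    print(f"estimated baseline (mu): {mu:.2f}")
    print(f"delay ratio for a 30-unit response: {delay_ratio(30.0, mu):.2f}")
```

Because the ratio is dimensionless, applications with very different absolute response times can share one scale, which is what makes a single view possible. A maximum-likelihood fit (for example with SciPy's `scipy.stats.exponnorm`) would be a natural alternative to the moment estimator used here.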