A Two-Stage Heavy Hitter Detection System Based on CPU Spikes at Cloud-Scale Gateways

Autor: Mao Miao, Guangzhe Zhou, Shunmin Zhu, Tian Pan, Biao Lyu, Jianyuan Lu, Yining Qi, Shan He
Rok vydání: 2021
Předmět:
Zdroj: ICDCS
DOI: 10.1109/icdcs51616.2021.00041
Popis: The cloud network provides sharing resources for tens of thousands of tenants to achieve economics of scale. However, heavy hitters caused by a single tenant will probably interfere with the processing of the cloud gateways, undermining the predictable performance expected by other cloud tenants. To prevent it, heavy hitter detection becomes a key concern at the performance-critical cloud gateways but faces the dilemma between fine granularity and low overhead. In this work, we present CloudSentry, a scalable two-stage heavy hitter detection system dedicated to multi-tenant cloud gateways against such a dilemma. CloudSentry contains a lightweight coarse-grained detection running 24/7 to localize infrequent CPU spikes. Then it invokes a fine-grained detection to precisely dump and analyze the potential heavy-hitter packets at the CPU spikes. After that, a more comprehensive analysis is conducted to associate heavy hitters with the cloud service scenarios and invoke a corresponding backpressure procedure. CloudSentry significantly reduces memory, computation and storage overhead compared with existing approaches. Additionally, it has been deployed world-wide in Alibaba Cloud for over one year, with rich deployment experiences. In a gateway cluster under an average traffic throughput of of 251Gbps, CloudSentry consumes only a fraction of 2%-5% CPU utilization with 8KB run-time memory, producing only 10MB heavy hitter logs during one month.
Databáze: OpenAIRE