Network Telemetry by Observing and Recording on Programmable Data Plane

Authors: Wai-Xi Liu, Wen-Hong Lin, Song Wu, Sen Ling, Jin-Jiang Fu, Xing Liang, Zhi-Tao Chen, Gui-Feng Chen
Year of publication: 2021
Subject:
Source: Networking
DOI: 10.23919/ifipnetworking52078.2021.9472807
Description: Fine-grained, real-time, and accurate monitoring data help detect equipment failures and support traffic engineering. However, existing in-band network telemetry (INT) implementations still have several drawbacks: they lack real-time monitoring, incur relatively high overhead because they operate on every packet, and cover a limited monitoring range. This paper proposes O&R, a fine-grained, real-time telemetry scheme based on INT and the programmable data plane (PDP) that works by observing and recording on the data plane. The key idea is to use registers on the data plane to observe the state of the packets it forwards, and to add a customized header to normal data packets to record how each packet is forwarded along its routing path. Besides measuring conventional performance parameters such as end-to-end delay, jitter, throughput, and packet loss rate, O&R includes a clock offset elimination algorithm that synchronizes the time of two adjacent switches. This enables finer-grained measurements on any hop, such as queuing delay, processing delay, transmission delay, and propagation delay. O&R can also measure the queue state, which includes the real-time queue depth and the number of flows sharing the queue. Extensive experimental results for a $\mathrm{K}=4$ fat-tree data-center network demonstrate the effectiveness of O&R in terms of higher accuracy, better real-time performance, lower overhead, and finer granularity compared to existing schemes. The measurement accuracy of O&R is 46.3% higher than that of an INT-like method. The measurement delay of O&R is ∼1 ms, while an INT-like method needs ∼20 ms. The measurement overhead of O&R is only 2.19% of that of Pingmesh.
Database: OpenAIRE
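
Note: The description mentions removing the clock offset between two adjacent switches and then splitting per-hop delay into queuing, processing, transmission, and propagation components. The record does not give the algorithm, so the following is only a minimal Python sketch under stated assumptions. It substitutes a standard two-way (NTP-style) offset estimate for the paper's clock offset elimination algorithm, and it assumes the egress timestamp marks the start of transmission while the next hop's ingress timestamp marks full reception. The `HopRecord` fields and both function names are hypothetical and are not taken from O&R's header layout.

```python
from dataclasses import dataclass


@dataclass
class HopRecord:
    """Hypothetical per-hop timestamps (seconds, local switch clock)."""
    ingress_ts: float   # packet enters the switch
    enqueue_ts: float   # packet is placed in the egress queue
    dequeue_ts: float   # packet leaves the egress queue
    egress_ts: float    # packet starts transmission on the link


def two_way_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Estimate clock(B) - clock(A), assuming symmetric one-way delay.

    t1: A sends (A clock), t2: B receives (B clock),
    t3: B replies (B clock), t4: A receives (A clock).
    This is the standard NTP-style estimate, not the paper's algorithm.
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0


def hop_delays(cur: HopRecord, next_ingress_ts: float, offset: float,
               pkt_bits: int, link_bps: float) -> dict:
    """Split one hop's delay into four components.

    offset is clock(next switch) - clock(current switch).
    """
    queuing = cur.dequeue_ts - cur.enqueue_ts
    # Time inside the switch that was not spent waiting in the queue.
    processing = (cur.egress_ts - cur.ingress_ts) - queuing
    transmission = pkt_bits / link_bps
    # Convert the next hop's timestamp into the current switch's clock.
    link_time = (next_ingress_ts - offset) - cur.egress_ts
    propagation = link_time - transmission
    return {
        "queuing": queuing,
        "processing": processing,
        "transmission": transmission,
        "propagation": propagation,
    }
```

For example, with a 1500-byte packet on a 1 Gbit/s link, `hop_delays` reports a transmission delay of 12 microseconds and attributes the remainder of the offset-corrected link time to propagation. Any asymmetry between the two directions of the link would bias the offset estimate, and therefore the propagation figure, which is a known limitation of the two-way estimate used here.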