Description: |
In 2016, Google introduced BBR, which represents a new class of model-based TCP congestion control. BBR improved the throughput and latency of Google's backbone and services and is now the second most popular TCP variant on the Internet. Because BBR is designed as a general-purpose congestion control algorithm to replace widely deployed algorithms such as Reno and CUBIC, it is important to study its performance in different types of networks. In this paper, we study BBR's performance in cloud networks, which have grown rapidly but have not been examined in existing work on BBR. For the first time, we show both analytically and experimentally that, due to virtual machine (VM) scheduling in cloud networks, BBR underestimates the pacing rate, delivery rate, and estimated bandwidth, which are three key elements of its control loop. This underestimation can compound iteratively and exponentially over time and can reduce BBR's throughput to almost zero. We propose a BBR patch that captures the impact of VM scheduling on BBR's model and improves its throughput in cloud networks. Our evaluation of the modified BBR on a testbed and on EC2 shows a significant improvement in throughput and bandwidth estimation accuracy over the original BBR in cloud networks with heavy VM scheduling. |