BFTBrain: Adaptive BFT Consensus with Reinforcement Learning

Author:	Wu, Chenyuan; Qin, Haoyun; Amiri, Mohammad Javad; Loo, Boon Thau; Malkhi, Dahlia; Marcus, Ryan
Year of publication:	2024
Subject:	Computer Science - Distributed, Parallel, and Cluster Computing
Document type:	Working Paper
Description:	This paper presents BFTBrain, a reinforcement learning (RL) based Byzantine fault-tolerant (BFT) system that provides significant operational benefits: it is a plug-and-play system suitable for a broad set of hardware and network configurations, and it adjusts effectively in real time to changing fault scenarios and workloads. BFTBrain adapts to system conditions and application needs by switching between a set of BFT protocols in real time. Two main advances contribute to BFTBrain's agility and performance. First, BFTBrain is based on a systematic, thorough modeling of metrics that correlate the performance of the studied BFT protocols with varying fault scenarios and workloads. These metrics are fed as features to BFTBrain's RL engine in order to choose the best-performing BFT protocols in real time. Second, BFTBrain coordinates RL in a decentralized manner that is resilient to adversarial data pollution: nodes share local metering values and reach the same learning output by consensus. As a result, in addition to providing significant operational benefits, BFTBrain improves throughput over fixed protocols by 18% to 119% under dynamic conditions and outperforms state-of-the-art learning-based approaches by 44% to 154%. Comment: To appear in 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI), April 2025.
Database:	arXiv
External link:	http://arxiv.org/abs/2408.06432