Reining in Long Tails in Warehouse-Scale Computers with Quick Voltage Boosting Using Adrenaline

Author:	Ronald G. Dreslinski, Michael A. Laurenzano, David Meisner, Lingjia Tang, Jason Mars, Yunqi Zhang, Thomas F. Wenisch, Chang-Hong Hsu
Publication year:	2017
Subject:	010302 applied physics; Boosting (machine learning); General Computer Science; Computer science; business.industry; Quality of service; Real-time computing; 02 engineering and technology; 01 natural sciences; Power budget; 020202 computer hardware & architecture; Sliding window protocol; Embedded system; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Long tail; Latency (engineering); Frequency scaling; business; Efficient energy use
Source:	ACM Transactions on Computer Systems. 35:1-33
ISSN:	1557-7333 0734-2071
DOI:	10.1145/3054742
Description:	Reducing the long tail of the query latency distribution in modern warehouse-scale computers is critical for improving performance and quality of service (QoS) of workloads such as Web Search and Memcached. Traditional turbo boost increases a processor’s voltage and frequency during a coarse-grained sliding window, boosting all queries that are processed during that window. However, the inability of such a technique to pinpoint tail queries for boosting limits its tail reduction benefit. In this work, we propose Adrenaline, an approach that leverages finer-granularity (tens of nanoseconds) voltage boosting to effectively rein in the tail latency with query-level precision. Two key insights underlie this work. First, emerging finer-granularity voltage/frequency boosting is an enabling mechanism for intelligent allocation of the power budget to precisely boost only the queries that contribute to the tail latency; second, per-query characteristics can be used to design indicators for proactively pinpointing these queries, triggering boosting accordingly. Based on these insights, Adrenaline effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost. By evaluating under various workload configurations, we demonstrate the effectiveness of our methodology. We achieve up to a 2.50× tail latency improvement for Memcached and up to 3.03× for Web Search over coarse-grained dynamic voltage and frequency scaling (DVFS) given a fixed boosting power budget. When optimizing for energy reduction, Adrenaline achieves up to a 1.81× improvement for Memcached and up to 1.99× for Web Search over coarse-grained DVFS. By using carefully chosen boost thresholds, Adrenaline further improves the tail latency reduction to 4.82× over coarse-grained DVFS.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::8806cb237798ef25174d656606fe1d81 https://doi.org/10.1145/3054742
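
The description above contrasts boosting every query in a window with boosting only the queries that a per-query indicator flags as likely tail contributors. The following is a minimal illustrative sketch of that threshold idea, not the authors' implementation: the `Query` fields, the `simulate` and `percentile` helpers, the fixed `speedup` factor, and the sample workload are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Query:
    indicator_ms: float  # hypothetical per-query indicator, e.g. a predicted service time
    base_ms: float       # service time at the nominal voltage/frequency


def simulate(queries: List[Query], threshold_ms: float,
             speedup: float) -> Tuple[List[float], int]:
    """Boost only queries whose indicator reaches the threshold.

    Returns the per-query latencies and the number of boosted queries.
    A boosted query is assumed to run `speedup` times faster, which is
    a simplification of real voltage/frequency behavior.
    """
    latencies = []
    boosted = 0
    for q in queries:
        if q.indicator_ms >= threshold_ms:
            latencies.append(q.base_ms / speedup)
            boosted += 1
        else:
            latencies.append(q.base_ms)
    return latencies, boosted


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical workload: nine short queries and one long (tail) query.
queries = [Query(1.0, 1.0)] * 9 + [Query(10.0, 10.0)]

# An infinite threshold boosts nothing; a 5 ms threshold boosts only the long query.
baseline, _ = simulate(queries, threshold_ms=float("inf"), speedup=2.0)
targeted, n_boosted = simulate(queries, threshold_ms=5.0, speedup=2.0)
```

In this toy workload, a single boosted query halves the 99th-percentile latency (from 10 ms to 5 ms) while the nine short queries consume no boosting budget. The figures are properties of the made-up inputs only and say nothing about the results reported in the paper.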