In data veritas

Authors: David Zhang, Ramesh Subramonian, Kapil Surlaker, Kishore Gopalakrishna, Zhen Zhang, Bob Schulman, Sajid Topiwala, Mihir Gandhi
Year of publication: 2013
Subject:
Source: DBTest
DOI: 10.1145/2479440.2479448
Description: The increasing deployment of distributed systems to solve large data and computational problems has not seen a concomitant increase in tools and techniques to test these systems. In this paper, we propose a data-driven approach to testing. We translate our intuitions and expectations about how the system should behave into invariants, the truth of which can be verified from data emitted by the system. Our particular implementation of the invariants uses Q, a high-performance analytical database, programmed with a vector language. To show the practical value of this approach, we describe how it was used to test Helix, a distributed cluster manager deployed at LinkedIn. We make the case that looking at testing as an exercise in data analytics has the following benefits. It (a) increases the expressivity of the tests, (b) decreases their fragility, and (c) suggests additional, insightful ways to understand the system under test. As the title of the paper suggests, there is truth in the data; we only need to look for it.
Database: OpenAIRE