Iris: Tuning the configuration parameters of NoSQL databases for high-throughput digital agricultural processing pipelines

Author: Ashraf Mahgoub, Somali Chaterji, Rajesh Kumar
Year of publication: 2019
Subject:
Source: 2019 Boston, Massachusetts July 7 - July 10, 2019.
Description: Precision agriculture (Precision AG) provides accurate farming techniques through advanced monitoring, measurement, and timely decisions. Powered by NoSQL datastores, agricultural processing pipelines can now scale to levels beyond what traditional database management systems, such as PostgreSQL, can achieve. However, tuning NoSQL datastores for high throughput and low latency under precision agriculture workloads is challenging for several reasons. First, NoSQL datastores have many performance-sensitive configuration parameters; Cassandra, for example, has 50. Second, the aggregate workload in precision AG environments changes over time due to environmental changes such as flash floods or the onset of crop diseases. When the workload changes, new configuration parameters are needed to sustain optimal performance. In this paper, we introduce our system, Iris, to tune Redis, which is one of the most popular NoSQL datastores. First, we apply machine learning techniques to identify the most impactful performance-sensitive parameters to tune. Second, we use performance prediction models, deep learning and random forest variants, to serve as surrogate models for the NoSQL datastore. This allows new configuration parameters to be tested for a new workload much faster than benchmarking the actual NoSQL datastore every time the workload changes. Finally, we show that Iris achieves better performance than the NoSQL default configurations as well as the best static configuration in both throughput and latency metrics.
Database: OpenAIRE
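
Note: The surrogate-model idea in the description can be illustrated with a minimal sketch, under loud assumptions. The `benchmark` function is a synthetic stand-in for a slow datastore run, the two parameters (`cache_mb`, `threads`) are hypothetical, and a k-nearest-neighbour regressor is swapped in for the deep learning and random forest models the abstract mentions. It is not the Iris implementation; it only shows how a model trained on a few benchmark runs can rank many candidate configurations for a new workload without re-running the datastore.

```python
import random


def benchmark(config, read_ratio):
    """Hypothetical stand-in for a slow benchmark run (synthetic, not a real datastore)."""
    cache_mb, threads = config
    read_score = min(cache_mb, 512) / 512   # assumed: reads saturate at 512 MB of cache
    write_score = min(threads, 32) / 32     # assumed: writes saturate at 32 threads
    return 100.0 * (read_ratio * read_score + (1.0 - read_ratio) * write_score)


def collect_samples(n, seed=0):
    """Run the (expensive) benchmark on n random configurations and workloads."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        config = (rng.randint(64, 1024), rng.randint(1, 64))
        ratio = rng.random()
        samples.append((config, ratio, benchmark(config, ratio)))
    return samples


def fit_surrogate(samples, k=3):
    """Return predict(config, read_ratio): mean throughput of the k nearest samples."""
    def predict(config, read_ratio):
        def distance(sample):
            (cache_mb, threads), ratio, _ = sample
            # Scale each dimension to roughly [0, 1] before comparing.
            return (((cache_mb - config[0]) / 1024) ** 2
                    + ((threads - config[1]) / 64) ** 2
                    + (ratio - read_ratio) ** 2)
        nearest = sorted(samples, key=distance)[:k]
        return sum(s[2] for s in nearest) / len(nearest)
    return predict


def tune(predict, read_ratio, candidates):
    """Pick the candidate with the highest predicted throughput for this workload."""
    return max(candidates, key=lambda c: predict(c, read_ratio))


if __name__ == "__main__":
    predict = fit_surrogate(collect_samples(200))
    default = (128, 4)  # hypothetical default configuration
    grid = [(c, t) for c in range(64, 1025, 64) for t in (1, 2, 4, 8, 16, 32, 64)]
    for ratio in (0.9, 0.1):  # read-heavy, then write-heavy workload
        best = tune(predict, ratio, grid + [default])
        print(ratio, best, round(predict(best, ratio), 1))
```

Only `collect_samples` calls the benchmark; `tune` consults the surrogate alone, which is what makes re-tuning after a workload shift cheap in this sketch.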