How Industry Tackles Anomalies during Runtime: Approaches and Key Monitoring Parameters

Authors: Steidl, Monika; Dornauer, Benedikt; Felderer, Michael; Ramler, Rudolf; Racasan, Mircea-Cristian; Gattringer, Marko
Year of publication: 2024
Subject:
Document type: Working Paper
DOI: 10.1109/SEAA64295.2024.00062
Description: Deviations from expected behavior during runtime, known as anomalies, have become more common due to the systems' complexity, especially for microservices. Consequently, analyzing runtime monitoring data, such as logs, traces for microservices, and metrics, is challenging due to the large volume of data collected. Developing effective rules or AI algorithms requires a deep understanding of this data to reliably detect unforeseen anomalies. This paper seeks to comprehend anomalies and current anomaly detection approaches across diverse industrial sectors. Additionally, it aims to pinpoint the parameters necessary for identifying anomalies via runtime monitoring data. Therefore, we conducted semi-structured interviews with fifteen industry participants who rely on anomaly detection during runtime. Additionally, to supplement information from the interviews, we performed a literature review focusing on anomaly detection approaches applied to industrial real-life datasets. Our paper (1) demonstrates the diversity of interpretations and examples of software anomalies during runtime and (2) explores the reasons behind choosing rule-based approaches in the industry over self-developed AI approaches. AI-based approaches have become prominent in published industry-related papers in the last three years. Furthermore, we (3) identified key monitoring parameters collected during runtime (logs, traces, and metrics) that assist practitioners in detecting anomalies during runtime without introducing bias in their anomaly detection approach due to inconclusive parameters.
Comment: accepted at 2024 50th Euromicro Conference on Software Engineering and Advanced Applications (SEAA)
Database: arXiv