Deployment of Batch Processing for Log File Analysis
Author: | Timo Hämäläinen, Esa Heikkinen |
---|---|
Year of publication: | 2020 |
Subject: | Database; Computer science; Complex event processing; 020206 networking & telecommunications; 02 engineering and technology; Python (programming language); Temporal database; Stream processing; Software deployment; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Batch processing; Preprocessor; Intelligent transportation system |
Source: | ICPS |
DOI: | 10.1109/icps48405.2020.9274712 |
Description: | We have used log file analysis to mine expected behavior in intelligent transportation systems involving spatial and temporal data. The challenge is how to extract complex behavior from multiple traces, for which linear log analysis that proceeds row by row does not suffice. Complex Event Processing (CEP) is close to our need, but it is surprisingly difficult to set up and deploy general-purpose frameworks for this purpose. This paper originates from the need to compare our custom LOGDIG tool to Apache Flink. It focuses on the deployment effort of the two, for which reason we consider setting up the development and run-time environments, selecting the proper analysis approach, and evaluating the difficulty in five different aspects. While LOGDIG is written solely in Python, Flink is a combination of many languages, libraries, packages and tools. Our comparison includes Flink in batch and stream processing modes using external and internal preprocessing. We borrow the Degree of Difficulty (DoD) measure from sports to assess the deployment effort. Flink needs significant setup effort to deploy the same functionality as LOGDIG. Flink is under continuous development, while LOGDIG is more focused and stable and can be used more easily off the shelf. |
Database: | OpenAIRE |