Popis: |
INFN-CNAF is one of the Worldwide LHC Computing Grid (WLCG) Tier-1 data centres, providing computing, networking and storage resources to a wide variety of scientific collaborations, not limited to the four LHC (Large Hadron Collider) experiments. The INFN-CNAF data centre will move to a new location next year. At the same time, the requirements of our experiments and users are becoming increasingly challenging, and new scientific communities have started or will soon start exploiting our resources. We are currently re-engineering several services, in particular our monitoring infrastructure, in order to improve day-to-day operations and to cope with the increasing complexity of the use cases and with the future expansion of the centre. This scenario led us to implement a data streaming infrastructure designed to enable log analysis, anomaly detection, threat hunting, integrity monitoring and incident response. The platform has been organised to manage different kinds of data coming from heterogeneous sources, to support multi-tenancy and to be scalable. Moreover, we will be able to provide an on-demand, end-to-end data streaming application to users and communities requesting this kind of facility. The infrastructure is based on the Apache Kafka platform, which provides streaming of events at large scale, with authentication and authorization configured at the topic level to ensure data isolation and protection. Data can be consumed by different applications, such as those devoted to log analysis, which can index large amounts of data and implement appropriate access policies for inspecting and visualising information. In this contribution we present and motivate our technological choices for the definition of the infrastructure, describe its components and illustrate use cases that can be addressed with this platform. |