MAYOR: Machine Learning and Analytics for Automated Operations and Recovery

Authors: Abhaya Asthana, Carlos Bernal, Manuel Castejon, Rashid Mijumbi
Publication year: 2019
Subject:
Source: ICCCN
DOI: 10.1109/icccn.2019.8847092
Description: Communications systems continuously generate a large number of alarms. Such alarms are usually monitored by network operations centers (NOCs), from which steps to resolve their causes are launched either automatically or through a ticketing system. To respond to a practical number of alarms in real time, automation is a must. The problem is more pronounced in virtualized infrastructure, where the number of alarm-generating entities is significantly higher because both physical and virtual functions have to be monitored. In this paper, we propose MAYOR: a suite of machine learning and analytics algorithms for automated operations and recovery. MAYOR is made up of a model generation entity which uses long-term historical data to determine alarm persistence times, clusters, and patterns. To this end, we model alarm persistence time as a normal distribution, and use the resulting cumulative distribution function to determine the time with an appropriate confidence. Moreover, we use sequential pattern mining and linear correlation to create alarm clusters. Finally, decision trees are used to create patterns between alarms as association rules. In addition, the system has an adaptation entity that uses real-time alarms to perform short-term adaptations. MAYOR has been implemented and evaluated using real telecommunications network alarm data as well as NOC settings. Evaluations show that the proposed persistence times can reduce 20% of the static ones by at least 80%, and that at least 23% of alarms can be predicted 1 hour before they appear with an accuracy of at least 80%.
Database: OpenAIRE
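
Note: The description states that alarm persistence time is modeled as a normal distribution and that the cumulative distribution function is used to pick a time with an appropriate confidence. The following is a minimal sketch of that idea, not the authors' implementation. The function name `persistence_threshold`, the sample durations, and the 95% confidence level are illustrative assumptions that do not come from the paper.

```python
from statistics import NormalDist, mean, stdev


def persistence_threshold(durations_s, confidence=0.95):
    """Return a persistence time (seconds) that the given fraction of
    historical alarm durations is expected to fall below, assuming the
    durations are normally distributed.

    Hypothetical helper: name and signature are illustrative only.
    """
    if len(durations_s) < 2:
        raise ValueError("need at least two historical durations")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")

    # Fit a normal distribution to the historical persistence times.
    # Note: all-identical durations give sigma == 0, for which
    # inv_cdf raises statistics.StatisticsError.
    fitted = NormalDist(mu=mean(durations_s), sigma=stdev(durations_s))

    # Invert the CDF: the time t such that P(duration <= t) = confidence.
    # Clamp at zero because a negative waiting time is meaningless.
    return max(0.0, fitted.inv_cdf(confidence))


# Made-up historical durations (seconds) for a single alarm type.
history = [42.0, 55.0, 48.0, 61.0, 50.0, 47.0, 58.0, 53.0]
threshold = persistence_threshold(history, confidence=0.95)
print(f"wait about {threshold:.1f} s before escalating this alarm")
```

Under these assumptions, an alarm that clears before the threshold could be treated as transient, and a higher confidence level yields a longer waiting time.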