A NLP-Oriented Methodology to Enhance Event Log Quality

Authors: F. Javier Ortega, María Teresa Gómez-López, Ángel Jesús Varela-Vaca, Belén Ramos-Gutiérrez, Moe Thandar Wynn
Contributors: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Universidad de Sevilla. TIC134: Sistemas Informáticos, Universidad de Sevilla. TIC258: Data-centric Computing Research Hub, Ministerio de Ciencia e Innovación (MICIN). España, Junta de Andalucía
Publication year: 2021
Subject:
Source: Enterprise, Business-Process and Information Systems Modeling ISBN: 9783030791858
BPMDS/EMMSAD@CAiSE
idUS. Depósito de Investigación de la Universidad de Sevilla
Fundacion Sancho el Sabio Fundazioa (FSS)
ISSN: 2018-0942
DOI: 10.1007/978-3-030-79186-5_2
Description: The quality of event logs is a cornerstone for the feasibility of applying process mining techniques later on. The wide variety of data that can be included in an event log refers to information about the activity, such as what, who or where. In this paper, we focus on event logs that include textual information written in natural language that contains exhaustive descriptions of activity executions. In this context, a pre-processing step is necessary, since textual information is unstructured and can contain inaccuracies that make process mining techniques impracticable. For this reason, we propose a methodology that applies Natural Language Processing (NLP) to the raw event log by relabelling activities. The approach allows the measurement and assessment of event log quality to be customised according to expert requirements. Additionally, it guides the selection of the most suitable NLP techniques depending on the event log. The methodology has been evaluated using a real-life event log that includes detailed textual descriptions capturing the management of incidents in the aircraft assembly process in aerospace manufacturing. Funding: Ministerio de Ciencia e Innovación RTI2018-094283-B-C33; Ministerio de Ciencia e Innovación RTI2018-098062-A-I00; Junta de Andalucía P20-01224 (COPERNICA)
Database: OpenAIRE