On the role of fitness, precision, generalization and simplicity in process discovery
Author: | Buijs, J.C.A.M., Dongen, van, B.F., Aalst, van der, W.M.P., Meersman, R. |
---|---|
Contributors: | Process Science |
Language: | English |
Year of publication: | 2013 |
Subject: | Process modeling; Event (computing); Computer science; Generalization; Work in process; Machine learning; Measure (mathematics); Business process discovery; Quality (business); Simplicity; Artificial intelligence |
Source: | On the Move to Meaningful Internet Systems: OTM 2012 (Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2012, Rome, Italy, September 10-14, 2012. Proceedings, Part I), pp. 305-322. OTM Conferences (1). ISBN: 9783642336058 |
ISSN: | 0302-9743 |
DOI: | 10.1007/978-3-642-33606-5_19 |
Description: | Process discovery algorithms typically aim at discovering process models from event logs that best describe the recorded behavior. Often, the quality of a process discovery algorithm is measured by quantifying to what extent the resulting model can reproduce the behavior in the log, i.e., replay fitness. At the same time, there are many other metrics that compare a model with recorded behavior in terms of the precision of the model and the extent to which the model generalizes the behavior in the log. Furthermore, several metrics exist to measure the complexity of a model irrespective of the log. In this paper, we show that existing process discovery algorithms typically consider at most two out of the four main quality dimensions: replay fitness, precision, generalization and simplicity. Moreover, existing approaches cannot steer the discovery process based on user-defined weights for the four quality dimensions. This paper also presents the ETM algorithm, which allows the user to seamlessly steer the discovery process based on preferences with respect to the four quality dimensions. We show that all dimensions are important for process discovery. However, it only makes sense to consider precision, generalization and simplicity if the replay fitness is acceptable. |
Database: | OpenAIRE |
External link: |