ProvAnalyser: A Framework for Scientific Workflows Provenance
Authors: | Peter Fitch, Anila Sahar Butt |
---|---|
Publication year: | 2021 |
Subject: | Dependency (UML); Event (computing); Computer science; Engineering and technology; Ontology (information science); Workflow engine; Task (computing); Workflow; Information systems; Electrical engineering, electronic engineering, information engineering; SPARQL; Artificial intelligence & image processing; Use case; Software engineering |
Source: | Communications in Computer and Information Science; ISBN: 9783030674441; MODELSWARD (Revised Selected Papers) |
DOI: | 10.1007/978-3-030-67445-8_5 |
Description: | The growing capability of data-driven science is creating a need for applications that are controlled by data-centric workflows, also known as scientific workflows. This work focuses on provenance collection for these workflows, which is necessary to validate a workflow and to determine the quality of the data products it generates. However, instrumenting a workflow engine for provenance collection is burdensome: it requires adding hooks to the engine to capture provenance, which can perturb execution. We address the challenge of extracting provenance data, in the form of a knowledge graph, from the workflows' event logs in order to record critical information about the applications and the workflows. We present an ontology-based framework for provenance collection using the event logs of the workflow engine. Further, we reduce provenance use cases to SPARQL queries over the captured provenance knowledge graph. Performance evaluation demonstrates that the framework is capable of reconstructing complete data and invocation dependency graphs from one or more execution traces. |
Database: | OpenAIRE |
External link: |
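
The description above outlines a pipeline: event log entries become a provenance knowledge graph, and provenance questions become queries over that graph. The following is only a minimal sketch of that idea, not the ProvAnalyser implementation. It assumes a hypothetical log layout (`<timestamp> <task> <READ|WRITE> <data item>`), stores triples as plain Python tuples instead of using an ontology and a triple store, and answers one data-dependency question with a graph walk where the framework would use a SPARQL query. The predicate names `prov:used` and `prov:wasGeneratedBy` are borrowed from the W3C PROV-O vocabulary; the function names and sample data are illustrative.

```python
# Hypothetical event-log format: "<timestamp> <task> <READ|WRITE> <data item>".
# Real workflow engines log differently; this layout is an assumption.
LOG = """\
t1 extract WRITE raw.csv
t2 clean READ raw.csv
t3 clean WRITE clean.csv
t4 plot READ clean.csv
t5 plot WRITE figure.png
"""


def log_to_triples(log_text):
    """Turn each log event into a (subject, predicate, object) triple."""
    triples = set()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        _timestamp, task, action, item = line.split()
        if action == "READ":
            # The task consumed the data item.
            triples.add((task, "prov:used", item))
        elif action == "WRITE":
            # The data item was produced by the task.
            triples.add((item, "prov:wasGeneratedBy", task))
    return triples


def derived_from(triples, item):
    """Return every data item that `item` transitively depends on."""
    generated_by = {s: o for s, p, o in triples if p == "prov:wasGeneratedBy"}
    used = {}
    for s, p, o in triples:
        if p == "prov:used":
            used.setdefault(s, set()).add(o)

    seen, stack = set(), [item]
    while stack:
        current = stack.pop()
        task = generated_by.get(current)  # None for workflow inputs
        for source in used.get(task, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


triples = log_to_triples(LOG)
print(sorted(derived_from(triples, "figure.png")))
```

Because the graph is rebuilt from logs alone, nothing in this sketch hooks into the workflow engine, which is the property the description emphasises. Merging the triples from several log files into one set before querying would correspond to reconstructing dependencies from more than one execution trace.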