Interactive ontology debugging: two query strategies for efficient fault localization

Authors: Shchekotykhin, Kostyantyn; Friedrich, Gerhard; Fleiss, Philipp; Rodler, Patrick
Publication year: 2011
Subject:
Source: Journal of Web Semantics 12 (2012) 88-103
Document type: Working Paper
DOI: 10.1016/j.websem.2011.12.006
Description: Effective debugging of ontologies is an important prerequisite for their broad application, especially in areas that rely on everyday users to create and maintain knowledge bases, such as the Semantic Web. In such systems, ontologies capture formalized vocabularies of terms shared by their users. However, in many cases users have different local views of the domain, i.e. of the context in which a given term is used. Inappropriate usage of terms, together with the natural complications of formulating and understanding logical descriptions, may result in faulty ontologies. Recent ontology debugging approaches use diagnosis methods to identify the causes of the faults. In most debugging scenarios these methods return many alternative diagnoses, thus placing the burden of fault localization on the user. This paper demonstrates how the target diagnosis can be identified by performing a sequence of observations, that is, by querying an oracle about entailments of the target ontology. To identify the best query we propose two query selection strategies: a simple "split-in-half" strategy and an entropy-based strategy. The latter allows knowledge about typical user errors to be exploited to minimize the number of queries. Our evaluation showed that the entropy-based method significantly reduces the number of required queries compared to the "split-in-half" approach. We experimented with different probability distributions of user errors and different qualities of the a-priori probabilities. Our measurements demonstrated the superiority of entropy-based query selection even in cases where all fault probabilities are equal, i.e. where no information about typical user errors is available.
Comment: Published in Web Semantics: Science, Services and Agents on the World Wide Web. arXiv admin note: substantial text overlap with arXiv:1004.5339
Database: arXiv
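
Illustration: the two query selection strategies named in the description can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each candidate query partitions the current diagnoses into three sets: those predicting the answer "yes" (dp), those predicting "no" (dn), and those making no prediction (d0). It also assumes that the fault probabilities of the diagnoses sum to 1. The scoring functions follow a common formulation of such strategies and may differ in detail from the paper's. All names (split_in_half_score, entropy_score, pick_query, QUERIES, PROB) and the example data are hypothetical.

```python
from math import log2

# Hypothetical a-priori probabilities of four candidate diagnoses (sum to 1).
PROB = {"d1": 0.7, "d2": 0.1, "d3": 0.1, "d4": 0.1}

# Hypothetical queries: name -> (dp, dn, d0), i.e. the diagnoses that
# predict "yes", predict "no", or make no prediction for that query.
QUERIES = {
    "Q1": ({"d1", "d2"}, {"d3", "d4"}, set()),
    "Q2": ({"d1"}, {"d2", "d3", "d4"}, set()),
}


def split_in_half_score(partition):
    """Lower is better: prefer queries that split the diagnoses evenly."""
    dp, dn, d0 = partition
    return abs(len(dp) - len(dn)) + len(d0)


def entropy_score(partition, prob=PROB):
    """Lower is better: prefer queries whose answer is most uncertain.

    Diagnoses without a prediction are assumed to support either answer
    with equal probability, and their mass is added as a penalty.
    """
    dp, dn, d0 = partition
    p0 = sum(prob[d] for d in d0)
    p_yes = sum(prob[d] for d in dp) + p0 / 2
    p_no = sum(prob[d] for d in dn) + p0 / 2

    def xlog2(x):
        # Convention: 0 * log2(0) = 0.
        return x * log2(x) if x > 0 else 0.0

    return xlog2(p_yes) + xlog2(p_no) + p0 + 1


def pick_query(queries, score):
    """Return the name of the query with the minimal score."""
    return min(queries, key=lambda name: score(queries[name]))


if __name__ == "__main__":
    print("split-in-half picks:", pick_query(QUERIES, split_in_half_score))
    print("entropy-based picks:", pick_query(QUERIES, entropy_score))
```

In this toy setting the two strategies disagree: the split-in-half score favours the query that divides the four diagnoses two against two, while the entropy score favours the query whose answer probabilities (0.7 versus 0.3) are closer to even once the fault probabilities are taken into account.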