Description: |
Recognizing that two Semantic Web documents or graphs are similar, and characterizing their differences, is useful in many tasks, including retrieval, updating, version control, and knowledge base editing. We describe a number of text-based similarity metrics that characterize the relation between Semantic Web graphs and evaluate these metrics for three specific cases of similarity that we have identified: graphs that use the same classes and properties and differ only in literal content, graphs that differ only in base URI, and graphs in a versioning relationship. When one graph is judged to be a version of another, we generate a “delta” consisting of the triples to be added to or removed from one graph to make the two equivalent. This method takes into account the text of the RDF graph’s serialization as a document, rather than relying solely on the document URI. We have prototyped these techniques in a system that we call Similis and evaluated its performance on several tasks using a collection of graphs from the archive of the Swoogle Semantic Web search engine. |
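The “delta” between two versions of a graph can be illustrated with plain set operations. The following is a minimal sketch and not the Similis implementation: it assumes each triple is a tuple of three strings, that the graphs contain no blank nodes (which would require graph matching rather than simple set difference), and the names `Delta`, `compute_delta`, `apply_delta`, and the example URIs are hypothetical.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple

# Assumption: a triple is (subject, predicate, object), all plain strings.
Triple = Tuple[str, str, str]


class Delta(NamedTuple):
    """Triples to add to and remove from the old graph to obtain the new one."""
    to_add: FrozenSet[Triple]
    to_remove: FrozenSet[Triple]


def compute_delta(old: Iterable[Triple], new: Iterable[Triple]) -> Delta:
    """Return the set differences between two blank-node-free graphs."""
    old_set, new_set = frozenset(old), frozenset(new)
    return Delta(to_add=new_set - old_set, to_remove=old_set - new_set)


def apply_delta(graph: Iterable[Triple], delta: Delta) -> FrozenSet[Triple]:
    """Apply a delta: drop the removed triples, then add the new ones."""
    return (frozenset(graph) - delta.to_remove) | delta.to_add


# Hypothetical example: version 2 changes one literal and adds one triple.
v1 = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/age", "30"),
]
v2 = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/age", "31"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
]

delta = compute_delta(v1, v2)
```

Applying `delta` to `v1` with `apply_delta` yields the same set of triples as `v2`, which is the sense in which the delta makes the two graphs equivalent under these assumptions.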