Multi-lingual Concept Extraction with Linked Data and Human-in-the-Loop

Authors: Daniel Gruhl, Steve Welch, Petar Ristoski, Alfredo Alba, Anna Lisa Gentile, Anni R. Coden
Year of publication: 2017
Subject:
Source: K-CAP
Description: Ontologies are dynamic artifacts that evolve both in structure and content. Keeping them up to date is a very expensive and critical operation for any application relying on Semantic Web technologies. In this paper we focus on evolving the content of an ontology by extracting relevant instances of ontological concepts from text. We propose a novel technique that (i) is completely language independent, (ii) combines statistical methods with a human in the loop, and (iii) exploits Linked Data as a bootstrapping source. Our experiments on a publicly available medical corpus and on a Twitter dataset show that the proposed solution achieves comparable performance regardless of language, domain, and style of text. Given that the method relies on a human in the loop, our results can be safely fed directly back into Linked Data resources.
Database: OpenAIRE