Multi-lingual Concept Extraction with Linked Data and Human-in-the-Loop
Author: | Daniel Gruhl, Steve Welch, Petar Ristoski, Alfredo Alba, Anna Lisa Gentile, Anni R. Coden |
---|---|
Year of publication: | 2017 |
Subject: | Structure (mathematical logic); Focus (computing); Information retrieval; Computer science; Bootstrapping (linguistics); 02 engineering and technology; Linked data; Ontology (information science); Domain (software engineering); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Human-in-the-loop; 020201 artificial intelligence & image processing; Semantic Web |
Source: | K-CAP |
Description: | Ontologies are dynamic artifacts that evolve in both structure and content. Keeping them up to date is an expensive and critical operation for any application that relies on Semantic Web technologies. In this paper we focus on evolving the content of an ontology by extracting relevant instances of ontological concepts from text. We propose a novel technique that (i) is completely language independent, (ii) combines statistical methods with a human-in-the-loop, and (iii) exploits Linked Data as a bootstrapping source. Our experiments on a publicly available medical corpus and on a Twitter dataset show that the proposed solution achieves comparable performance regardless of the language, domain, and style of the text. Because the method relies on a human-in-the-loop, its results can be safely fed directly back into Linked Data resources. |
Database: | OpenAIRE |