Combining Deep Learning and Symbolic Processing for Extracting Knowledge from Raw Text

Authors:	Stefan Knerr, Jérémy Morvan, Marco Gori, Andrea Zugarini, Stefano Melacci
Publication year:	2018
Subject:	Computer science; business.industry; Deep learning; Computer Science (all); 02 engineering and technology; computer.software_genre; Information extraction; Learning from Constraints; Symbolic knowledge representation; Theoretical Computer Science; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Entropy (information theory); 020201 artificial intelligence & image processing; Artificial intelligence; Architecture; business; computer; Symbolic processing; Natural language processing
Source:	Artificial Neural Networks in Pattern Recognition (ANNPR), ISBN: 9783319999777
Description:	This paper addresses the problem of extracting knowledge from raw text. We present a deep architecture in the framework of Learning from Constraints [5] that is trained to identify mentions of entities and relations belonging to a given ontology. Each input word is encoded into two latent representations with different coverage of the local context, which are used to predict the entity type and the relation type to which the word belongs. Our model combines an entropy-based regularizer and a set of First-Order Logic formulas that bridge the predictions on entity and relation types according to the ontology structure. As a result, the system generates symbolic descriptions of the raw text that are interpretable and well suited to attaching human-level knowledge. We evaluate the model on a dataset composed of sentences about simple facts, which we make publicly available. The proposed system can efficiently learn to discover mentions with very little human supervision, and relating its predictions to knowledge in the form of logic constraints improves their quality.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bbed79f1a49c2820ca36de465fcfcc4e https://doi.org/10.1007/978-3-319-99978-4_7
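
Illustrative note: the description mentions an entropy-based regularizer combined with First-Order Logic formulas that tie relation predictions to entity predictions. The sketch below shows one way such an unsupervised penalty could be computed for a single word. It is not the authors' implementation: the Lukasiewicz relaxation of the implication, the weights `lambda_entropy` and `lambda_logic`, the `ALLOWED_ENTITIES` ontology, and all type names are assumptions made only for illustration.

```python
import math

# Hypothetical ontology (assumption): a word tagged with relation "born_in"
# should also be tagged with entity type "person" or "city".
ALLOWED_ENTITIES = {"born_in": ("person", "city")}


def entropy(probs):
    """Shannon entropy in nats of a discrete distribution given as a dict."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0.0)


def implication_penalty(premise, conclusion):
    """Degree of violation of `premise -> conclusion` for truth values in [0, 1].

    Assumed Lukasiewicz relaxation: zero when conclusion >= premise.
    """
    return max(0.0, premise - conclusion)


def logic_penalty(entity_p, relation_p):
    """Sum of violations of `relation(w) -> entity_a(w) OR entity_b(w)`."""
    total = 0.0
    for relation, allowed in ALLOWED_ENTITIES.items():
        premise = relation_p.get(relation, 0.0)
        # Lukasiewicz disjunction: bounded sum of the allowed entity scores.
        conclusion = min(1.0, sum(entity_p.get(e, 0.0) for e in allowed))
        total += implication_penalty(premise, conclusion)
    return total


def word_regularizer(entity_p, relation_p, lambda_entropy=0.1, lambda_logic=1.0):
    """Unsupervised penalty for one word: confident and ontology-consistent is cheap."""
    uncertainty = entropy(entity_p) + entropy(relation_p)
    return lambda_entropy * uncertainty + lambda_logic * logic_penalty(entity_p, relation_p)


# A word whose entity and relation predictions agree with the ontology ...
consistent = word_regularizer(
    {"person": 0.9, "city": 0.05, "none": 0.05},
    {"born_in": 0.8, "none": 0.2},
)
# ... and one predicted as "born_in" without a compatible entity type.
inconsistent = word_regularizer(
    {"person": 0.1, "city": 0.1, "none": 0.8},
    {"born_in": 0.9, "none": 0.1},
)
print(consistent < inconsistent)
```

In this sketch the entropy term pushes each word toward a confident prediction, while the logic term adds cost only when a relation is predicted without a compatible entity type; in a trainable model both terms would be added to the supervised loss.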