Learning in Text Streams: Discovery and Disambiguation of Entity and Relation Instances

Authors: Giuseppe Marra, Andrea Zugarini, Stefano Melacci, Marco Maggini
Publication year: 2020
Subject:
FOS: Computer and information sciences
Computer Science - Machine Learning
Relation (database)
Computer Networks and Communications
Computer science
Machine Learning (stat.ML)
02 engineering and technology
Machine Learning (cs.LG)
Data modeling
Knowledge-based systems
Machine learning
natural language processing (NLP)
recurrent neural networks
Statistics - Machine Learning
Artificial Intelligence
Reading (process)
0202 electrical engineering, electronic engineering, information engineering
Set (psychology)
Computer Science - Computation and Language
Computer Science Applications
Knowledge base
Identity (object-oriented programming)
Encyclopedia
020201 artificial intelligence & image processing
Artificial intelligence
Computation and Language (cs.CL)
Software
Natural language processing
Source: IEEE Transactions on Neural Networks and Learning Systems. 31:4475-4486
ISSN: 2162-2388
2162-237X
Description: We consider a scenario where an artificial agent is reading a stream of text composed of a set of narrations, and it is informed about the identity of some of the individuals that are mentioned in the text portion that is currently being read. The agent is expected to learn to follow the narrations, thus disambiguating mentions and discovering new individuals. We focus on the case in which individuals are entities and relations, and we propose an end-to-end trainable memory network that learns to discover and disambiguate them in an online manner, performing one-shot learning, and dealing with a small number of sparse supervisions. Our system builds a not-given-in-advance knowledge base, and it improves its skills while reading unsupervised text. The model deals with abrupt changes in the narration, taking into account their effects when resolving co-references. We showcase the strong disambiguation and discovery skills of our model on a corpus of Wikipedia documents and on a newly introduced dataset that we make publicly available.
Database: OpenAIRE