Compositional Generalization via Parsing Tree Annotation

Authors: Segwang Kim, Joonyoung Kim, Kyomin Jung
Publication year: 2021
Subjects:
Artificial intelligence
General Computer Science
Computer science
Principle of compositionality
02 engineering and technology
010501 environmental sciences
Semantics
01 natural sciences
0202 electrical engineering, electronic engineering, information engineering
General Materials Science
natural language processing
0105 earth and related environmental sciences
Transformer (machine learning model)
Parsing
Grammar
Artificial neural network
Deep learning
General Engineering
neural networks
Syntax
Task (computing)
Tree (data structure)
Delimiter
020201 artificial intelligence & image processing
lcsh:Electrical engineering. Electronics. Nuclear engineering
lcsh:TK1-9971
Natural language processing
Sentence
Source: IEEE Access, Vol 9, Pp 24326-24333 (2021)
ISSN: 2169-3536
DOI: 10.1109/access.2021.3055513
Description: Humans can understand a novel sentence by parsing it into known components like phrases and clauses. To achieve human-level artificial intelligence, compositional generalization tasks are suggested and used to assess machine learning models. Among those tasks, the SCAN tasks are challenging for standard deep learning models, such as RNN sequence-to-sequence models and Transformers, that show great success across many natural language processing tasks. Even though a long line of deep learning research has developed memory-augmented neural networks aimed at the SCAN tasks, their generality remains questionable for more complex and realistic applications where the standard seq2seq models dominate. Hence, a method is needed that helps the standard models discover compositional rules. To this end, we propose a data augmentation technique using parsing trees. Our technique annotates targets by inserting a new delimiter token between them according to their parsing trees. For the training stage, the technique needs prior knowledge about the targets' semantic or syntactic compositionality. For the test stage, on the other hand, the technique uses no such knowledge. Experiments show that our technique enables the standard models to achieve compositional generalization on the SCAN tasks. Furthermore, we validate our technique on a synthetic task and confirm the standard models' strong performance gains without using prior knowledge about semantic compositionality. As one way to infuse parsing tree information into sequences, our technique can be used for tasks with structured targets, such as program code generation.
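The annotation step described above can be pictured with a minimal Python sketch. It is not the authors' implementation: the nested-list tree encoding, the `<sep>` token, and the function names `annotate` and `strip_delimiters` are all assumptions made for illustration, and the abstract does not specify the exact placement rule for the delimiter. Here a delimiter is placed between sibling constituents of a target's parsing tree, while tokens inside a flat phrase are left untouched.

```python
from typing import List, Union

# Hypothetical encoding: a leaf is a target token (str),
# an inner node is a list of child trees.
Tree = Union[str, List["Tree"]]

DELIMITER = "<sep>"  # assumed token name, not taken from the paper


def annotate(tree: Tree, delimiter: str = DELIMITER) -> List[str]:
    """Flatten a parsing tree into a token sequence, inserting the
    delimiter between sibling constituents (illustrative rule only)."""
    if isinstance(tree, str):
        return [tree]
    # A node whose children are all leaves is a flat phrase: no delimiter.
    if all(isinstance(child, str) for child in tree):
        return list(tree)
    tokens: List[str] = []
    for index, child in enumerate(tree):
        if index > 0:
            tokens.append(delimiter)
        tokens.extend(annotate(child, delimiter))
    return tokens


def strip_delimiters(tokens: List[str], delimiter: str = DELIMITER) -> List[str]:
    """Recover the plain target sequence from an annotated one."""
    return [token for token in tokens if token != delimiter]


# SCAN-style example: "jump twice and walk" -> JUMP JUMP WALK,
# with an assumed tree grouping the two conjuncts.
target_tree = [["JUMP", "JUMP"], ["WALK"]]
annotated = annotate(target_tree)
print(" ".join(annotated))
print(" ".join(strip_delimiters(annotated)))
```

In this sketch the tree is needed only to build annotated training targets; at test time a model would emit delimiters on its own, and `strip_delimiters` recovers the plain output sequence, which mirrors the abstract's statement that no prior knowledge is used at the test stage.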
Database: OpenAIRE