Delexicalized Paraphrase Generation
Authors: | Wael Hamza, Konstantine Arkoudas, Boya Yu |
---|---|
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning (cs.LG); Computer Science - Artificial Intelligence (cs.AI); Computer Science - Computation and Language (cs.CL); Computer science; Artificial intelligence; Natural language processing; Natural language understanding; Semantics; Semantic equivalence; Paraphrase; Named-entity recognition; Convolutional neural network; Quality (business); Economics and business; Economics; Social sciences; Natural sciences; Earth and related environmental sciences; Environmental sciences |
Source: | COLING (Industry) |
DOI: | 10.48550/arxiv.2012.02763 |
Description: | We present a neural model for paraphrasing and train it to generate delexicalized sentences. We achieve this by creating training data in which each input is paired with a number of reference paraphrases. These sets of reference paraphrases represent a weak type of semantic equivalence based on annotated slots and intents. To understand semantics from different types of slots, other than anonymizing slots, we apply convolutional neural networks (CNN) prior to pooling on slot values and use pointers to locate slots in the output. We show empirically that the generated paraphrases are of high quality, leading to an additional 1.29% exact match on live utterances. We also show that natural language understanding (NLU) tasks, such as intent classification and named entity recognition, can benefit from data augmentation using automatically generated paraphrases. |
Database: | OpenAIRE |
External link: |
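
The abstract's notion of a delexicalized sentence can be illustrated with a minimal sketch. It shows only plain slot anonymization: each annotated slot value is replaced by a placeholder, and the values are later copied back into a paraphrase template. This is not the authors' model, which additionally encodes slot values with a CNN and uses pointers to place slots in the output. The function names `delexicalize` and `relexicalize`, the `{SlotName}` placeholder syntax, and the slot names `SongName` and `ArtistName` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (start token index, end token index exclusive, slot name); hypothetical annotation format
Span = Tuple[int, int, str]


def delexicalize(tokens: List[str], spans: List[Span]) -> Tuple[List[str], Dict[str, str]]:
    """Replace each annotated slot span with a {SlotName} placeholder."""
    by_start = {start: (end, name) for start, end, name in spans}
    out: List[str] = []
    values: Dict[str, str] = {}
    i = 0
    while i < len(tokens):
        if i in by_start:
            end, name = by_start[i]
            out.append("{" + name + "}")
            values[name] = " ".join(tokens[i:end])  # remember the original slot value
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return out, values


def relexicalize(tokens: List[str], values: Dict[str, str]) -> List[str]:
    """Fill placeholders in a delexicalized paraphrase with the stored slot values."""
    filled = []
    for tok in tokens:
        if tok.startswith("{") and tok.endswith("}"):
            filled.append(values.get(tok[1:-1], tok))  # leave unknown placeholders as-is
        else:
            filled.append(tok)
    return filled


utterance = "play hello by adele".split()
annotated = [(1, 2, "SongName"), (3, 4, "ArtistName")]
template, slot_values = delexicalize(utterance, annotated)

# A paraphrase template that a generator might produce (written by hand here).
paraphrase = "i want to hear {SongName} by {ArtistName}".split()
restored = " ".join(relexicalize(paraphrase, slot_values))
```

Working on templates instead of surface strings means one generated paraphrase can be reused with any slot values of the same types, which is what makes it usable for data augmentation.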