Improving Text Generation with Student-Forcing Optimal Transport
Author: | Qian Yang, Chunyuan Li, Yizhe Zhang, Lawrence Carin, Wenlin Wang, Jianqiao Li, Liqun Chen, Yuh-Chen Lin, Hao Fu, Chenyang Tao, Guoyin Wang, Dinghan Shen, Ruiyi Zhang |
Language: | English |
Year of publication: | 2020 |
Subject: | FOS: Computer and information sciences; Machine Learning (cs.LG); Computation and Language (cs.CL); machine translation; automatic summarization; text generation; language models |
Source: | EMNLP (1) |
Description: | Neural language models are typically trained with maximum likelihood estimation (MLE), where the next word is generated conditioned on the ground-truth word tokens. At test time, however, the model conditions on its own previously generated tokens, a mismatch known as exposure bias. To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. We further propose an extension that improves the OT learning by exploiting the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks. To appear at EMNLP 2020. |
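The core idea in the description, matching the teacher-forced and student-forced token sequences via optimal transport, can be illustrated with a toy entropy-regularized OT (Sinkhorn) distance between two sequences of token embeddings. This is a minimal sketch, not the authors' implementation: the function name, the cosine cost, the uniform marginals, and the regularization strength are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_ot_distance(x, y, eps=0.1, n_iters=200):
    """Entropic OT distance between two sequences of token embeddings.

    x: (n, d) embeddings of one sequence (e.g. teacher-forced tokens)
    y: (m, d) embeddings of another (e.g. student-forced tokens)
    Uses a cosine cost and uniform marginals; a sketch only.
    """
    # Cosine cost matrix: 0 for identical directions, up to 2 for opposite.
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    yn = y / np.linalg.norm(y, axis=1, keepdims=True)
    C = 1.0 - xn @ yn.T

    n, m = C.shape
    a = np.full(n, 1.0 / n)   # uniform weight on each token of x
    b = np.full(m, 1.0 / m)   # uniform weight on each token of y

    # Sinkhorn iterations on the Gibbs kernel K = exp(-C / eps).
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)

    # Transport plan and its total cost <T, C>.
    T = np.diag(u) @ K @ np.diag(v)
    return float(np.sum(T * C))
```

In the paper's setting this distance would serve as a differentiable sequence-level loss (computed on soft token embeddings) added to the MLE objective; here it is shown on plain NumPy arrays for clarity.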
Database: | OpenAIRE |
External link: |