Customizable text generation via conditional text generative adversarial network

Authors:	Jinyin Chen, Chengyu Jia, Yangyang Wu, Haibin Zheng, Guohan Huang
Year of publication:	2020
Subject:	Machine translation; Computer science; Orientation (computer vision); Cognitive Neuroscience; Computer Science Applications; Artificial Intelligence; Noun; Metric (mathematics); Text generation; Generative adversarial network; Natural language processing; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Industrial biotechnology; Engineering and technology; Industrial engineering & automation; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing
Source:	Neurocomputing. 416:125-135
ISSN:	0925-2312
Description:	Automatically generating meaningful and coherent text has many applications, such as machine translation, dialogue systems, and bot applications. Text generation has attracted growing attention over the past decades. Many effective methods have been proposed; however, generating text that rivals human-written text remains challenging: most systems output fixed-length text or can only generate text very similar to the training text. In this paper, we put forward a novel text generation system, called customizable conditional text generative adversarial network, which is capable of generating diverse text content of variable length with a customizable emotion label. This makes it more convenient to generate original text with a specific emotional orientation. We propose a conditional text generative adversarial network (CTGAN), in which the emotion label is adopted as an input channel to specify the output text, and a variable-length text generation strategy is put forward. After CTGAN generates the initial texts, to make the generated text match the real scene, we design an automated word-level replacement strategy, which extracts keywords (e.g., nouns) from the training texts and replaces specific keywords in the generated texts. Finally, we design a comprehensive evaluation metric, called the mixed evaluation metric, based on various text evaluation measures. Comprehensive experiments on real-world datasets show that the proposed CTGAN performs better than other text generation methods, i.e., its generated text is closer to real text than that of other generation methods, achieving state-of-the-art generation performance.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::21b0dda2edd6c6c43af82a5a5e735bfe https://doi.org/10.1016/j.neucom.2018.12.092