Customizable text generation via conditional text generative adversarial network
Author: | Jinyin Chen, Chengyu Jia, Yangyang Wu, Haibin Zheng, Guohan Huang |
---|---|
Year of publication: | 2020 |
Subject: | 0209 industrial biotechnology; Machine translation; Computer science; Orientation (computer vision); Cognitive Neuroscience; 02 engineering and technology; Computer Science Applications; 020901 industrial engineering & automation; Artificial Intelligence; Noun; Metric (mathematics); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering, electronic engineering, information engineering; Text generation; 020201 artificial intelligence & image processing; Generative adversarial network; Natural language processing |
Source: | Neurocomputing. 416:125-135 |
ISSN: | 0925-2312 |
Description: | Automatically generating meaningful and coherent text has many applications, such as machine translation, dialogue systems, and bot applications. Text generation has attracted increasing attention over the past decades, and many effective methods have been proposed. However, generating text that rivals human-written text remains challenging: most systems output fixed-length text, or can only generate text that closely resembles the training text. In this paper, we put forward a novel text generation system, called the customizable conditional text generative adversarial network, which is capable of generating diverse text of variable length with a customizable emotion label. This makes it more convenient to generate original text with a specific emotional orientation. We propose a conditional text generative adversarial network (CTGAN), in which the emotion label is adopted as an input channel to specify the output text, together with a variable-length text generation strategy. After CTGAN generates the initial texts, we apply an automated word-level replacement strategy to make the generated text match the real scene: it extracts keywords (e.g., nouns) from the training texts and replaces specific keywords in the generated texts. Finally, we design a comprehensive evaluation metric, called the mixed evaluation metric, which is based on various text evaluations. Comprehensive experiments on real-world datasets show that the proposed CTGAN performs better than other text generation methods, i.e., its generated text is closer to real text, achieving state-of-the-art generation performance. |
Database: | OpenAIRE |
External link: |