Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators
Author: | Shubhangi Tandon, Lena Reed, Marilyn A. Walker, Shereen Oraby, Stephanie M. Lukin, T S Sharath |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: | FOS: Computer and information sciences; Computation and Language (cs.CL); natural language processing; natural language generation; semantics; style (sociolinguistics); fidelity; utterance; control (linguistics); artificial intelligence |
Source: | SIGDIAL Conference |
Description: | Natural language generators for task-oriented dialogue must effectively realize system dialogue actions and their associated semantics. In many applications, it is also desirable for generators to control the style of an utterance. To date, work on task-oriented neural generation has primarily focused on semantic fidelity rather than achieving stylistic goals, while work on style has been done in contexts where it is difficult to measure content preservation. Here we present three different sequence-to-sequence models and carefully test how well they disentangle content and style. We use a statistical generator, Personage, to synthesize a new corpus of over 88,000 restaurant domain utterances whose style varies according to models of personality, giving us total control over both the semantic content and the stylistic variation in the training data. We then vary the amount of explicit stylistic supervision given to the three models. We show that our most explicit model can simultaneously achieve high fidelity to both semantic and stylistic goals: this model adds a context vector of 36 stylistic parameters as input to the hidden state of the encoder at each time step, showing the benefits of explicit stylistic supervision, even when the amount of training data is large. To appear at SIGDIAL 2018. |
Database: | OpenAIRE |
External link: |
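The abstract's most explicit model feeds a 36-dimensional vector of stylistic parameters into the encoder at every time step. A minimal sketch of that conditioning scheme is below, assuming a simple setup where the per-step encoder input is the token embedding concatenated with the (tiled) style vector; the function name, embedding size, and use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

STYLE_DIM = 36  # number of stylistic parameters, per the abstract

def condition_on_style(token_embeddings, style_vector):
    """Tile a fixed style vector across time and concatenate it onto each
    token embedding, so every encoder step sees the same style context.

    token_embeddings: (seq_len, embed_dim) array of input embeddings
    style_vector:     (STYLE_DIM,) array of stylistic parameters
    returns:          (seq_len, embed_dim + STYLE_DIM) encoder inputs
    """
    seq_len = token_embeddings.shape[0]
    tiled = np.tile(style_vector, (seq_len, 1))          # (seq_len, STYLE_DIM)
    return np.concatenate([token_embeddings, tiled], axis=1)

# Hypothetical usage: 7 tokens with 64-dim embeddings, one style setting.
inputs = condition_on_style(np.zeros((7, 64)), np.ones(STYLE_DIM))
# Each row now carries 64 content features plus 36 style features.
```

Because the style context is repeated at every time step rather than given once, the recurrent encoder cannot "forget" the stylistic goals over long utterances, which is consistent with the abstract's finding that explicit supervision helps even with large training data.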