Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators
Author: | Shubhangi Tandon, Lena Reed, Marilyn A. Walker, Shereen Oraby, Stephanie M. Lukin, T S Sharath |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: | FOS: Computer and information sciences; Computation and Language (cs.CL); natural language processing; natural language generation; semantics; style (sociolinguistics); fidelity; utterance; control (linguistics); artificial intelligence |
Source: | SIGDIAL Conference |
Description: | Natural language generators for task-oriented dialogue must effectively realize system dialogue actions and their associated semantics. In many applications, it is also desirable for generators to control the style of an utterance. To date, work on task-oriented neural generation has primarily focused on semantic fidelity rather than achieving stylistic goals, while work on style has been done in contexts where it is difficult to measure content preservation. Here we present three different sequence-to-sequence models and carefully test how well they disentangle content and style. We use a statistical generator, Personage, to synthesize a new corpus of over 88,000 restaurant domain utterances whose style varies according to models of personality, giving us total control over both the semantic content and the stylistic variation in the training data. We then vary the amount of explicit stylistic supervision given to the three models. We show that our most explicit model can simultaneously achieve high fidelity to both semantic and stylistic goals: this model adds a context vector of 36 stylistic parameters as input to the hidden state of the encoder at each time step, showing the benefits of explicit stylistic supervision, even when the amount of training data is large. To appear at SIGDIAL 2018. |
Database: | OpenAIRE |
External link: |
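The abstract's most explicit model feeds a 36-dimensional vector of stylistic parameters into the encoder at every time step. A minimal sketch of that conditioning scheme is below, assuming a simple setup where the per-step encoder input is the token embedding concatenated with the (tiled) style vector; the function name, embedding size, and use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

STYLE_DIM = 36  # number of stylistic parameters, per the abstract

def condition_on_style(token_embeddings, style_vector):
    """Tile a fixed style vector across time and concatenate it onto each
    token embedding, so every encoder step sees the same style context.

    token_embeddings: (seq_len, embed_dim) array of input embeddings
    style_vector:     (STYLE_DIM,) array of stylistic parameters
    returns:          (seq_len, embed_dim + STYLE_DIM) encoder inputs
    """
    seq_len = token_embeddings.shape[0]
    tiled = np.tile(style_vector, (seq_len, 1))          # (seq_len, STYLE_DIM)
    return np.concatenate([token_embeddings, tiled], axis=1)

# Hypothetical usage: 7 tokens with 64-dim embeddings, one style setting.
inputs = condition_on_style(np.zeros((7, 64)), np.ones(STYLE_DIM))
# Each row now carries 64 content features plus 36 style features.
```

Because the style context is repeated at every time step rather than given once, the recurrent encoder cannot "forget" the stylistic goals over long utterances, which is consistent with the abstract's finding that explicit supervision helps even with large training data.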