Attention-Based Response Generation Using Parallel Double Q-Learning for Dialog Policy Decision in a Conversational System
Author: | Ming-Hsiang Su, Chung-Hsien Wu, Liang-Yu Chen |
---|---|
Year of publication: | 2020 |
Subject: | Parsing; Correctness; Acoustics and Ultrasonics; Computer science; Q-learning; Autoencoder; Dialog act; Speech-language pathology & audiology; Medical and health sciences; Computational Mathematics; Computer Science (miscellaneous); Reinforcement learning; Artificial intelligence; Electrical and Electronic Engineering; Dialog box; Other medical science; Sentence; Natural language processing |
Source: | IEEE/ACM Transactions on Audio, Speech, and Language Processing. 28:131-143 |
ISSN: | 2329-9304 2329-9290 |
DOI: | 10.1109/taslp.2019.2949687 |
Description: | This article proposes an approach to response generation using a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is obtained by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the user's turn representation as well as the context representation. A Parallel Double Q-learning algorithm with a Deep Neural Network (PD-DQN), which combines two Double DQNs in parallel for the contextual and semantic information in the user's message, respectively, is proposed to determine the dialog act. Finally, the user's input and the determined dialog act are fed to an attention-based Transformer model to generate the response template. The semantic slots in the generated response template are then filled with their corresponding values to obtain the final sentence response. This article collects a multi-turn conversation database consisting of 4186 turns in the travel domain and 447 chitchat question-answer pairs as the evaluation corpus. Five-fold cross validation is employed for performance evaluation. Experimental results show that the proposed approach based on semantic dependency for intent detection increases the accuracy by 4.3%. For dialog policy decision, the PD-DQN achieves an 87.57% task success rate, which is 13.9 percentage points higher than the baseline Double DQN (73.67%). Finally, using the attention-based Transformer for response template generation obtains a BLEU score of 13.6, an improvement of 1.5 over the Sequence-to-Sequence model. In subjective evaluation, both the dialog policy and the sentence generation model achieve higher appropriateness and grammatical correctness scores than the baseline system. |
Database: | OpenAIRE |
External link: |
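
The abstract names Double Q-learning as the building block of the dialog policy but gives no formulas. As a rough illustration only, the sketch below shows the standard Double Q-learning target, in which the online estimate selects the next action and a separate target estimate scores it. The function `select_dialog_act` and its weight `w` are hypothetical: the record does not say how the contextual and semantic Double DQNs are combined, so the weighted sum here is an assumption, not the authors' method.

```python
from typing import Sequence


def double_q_target(reward: float, gamma: float, done: bool,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float]) -> float:
    """Standard Double Q-learning target for one transition.

    The online Q-values choose the next action; the target Q-values
    evaluate that choice, which reduces overestimation bias.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


def select_dialog_act(q_context: Sequence[float],
                      q_semantic: Sequence[float],
                      w: float = 0.5) -> int:
    """Hypothetical fusion of two parallel Q-value streams.

    ASSUMPTION: a weighted sum followed by argmax. The source does not
    specify the actual combination rule used by PD-DQN.
    """
    combined = [w * c + (1.0 - w) * s for c, s in zip(q_context, q_semantic)]
    return max(range(len(combined)), key=lambda a: combined[a])
```

With three candidate dialog acts, `select_dialog_act([0.1, 0.9, 0.3], [0.8, 0.2, 0.4])` picks the act whose combined score is largest under equal weighting; changing `w` shifts the balance between the contextual and semantic streams.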