Delay Mitigation for Backchannel Prediction in Spoken Dialog System

Authors: Takashi Sumiyoshi, Kenji Nagamatsu, Amalia Istiqlali Adiba, Dario Bertero, Takeshi Homma
Year of publication: 2020
Subject:
Source: Lecture Notes in Electrical Engineering ISBN: 9789811583940
IWSDS
DOI: 10.1007/978-981-15-8395-7_10
Description: Backchannel feedback can make interactions between spoken dialog systems and users more natural. Many related studies combine acoustic and lexical features in a single model to improve prediction. However, extracting lexical features introduces a delay caused by the automatic speech recognition (ASR) process. Systems should respond without delay, because delays reduce the naturalness of the conversation and leave users dissatisfied. In this work, we present a prior prediction model that reduces response delay in backchannel prediction. We first train acoustic- and lexical-based backchannel prediction models independently. The lexical-based model requires prior prediction to account for the ASR delay. The prior prediction model is trained with a weighting value that gradually increases as a sequence approaches a suitable response timing. The backchannel probability is then calculated from the outputs of both the acoustic- and lexical-based models. Evaluation results show that, under a 2.0-s delay condition, the prior prediction model predicts backchannels with an F1-score improvement rate 8% better than the current state-of-the-art algorithm.
Database: OpenAIRE
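
The description mentions two mechanisms: a training weight that grows as a sequence nears a suitable response timing, and a backchannel probability computed from the acoustic and lexical model outputs. The Python sketch below illustrates one possible reading of both. The record does not give the actual formulas, so everything here is an assumption made for illustration: the linear ramp, the weighted binary cross-entropy, the linear interpolation used for fusion, and the names `prior_weights`, `weighted_bce`, `fuse`, `ramp_frames`, and `alpha`.

```python
import math


def prior_weights(num_frames, target_frame, ramp_frames):
    """Per-frame training weights that rise toward the response timing.

    Assumption: a linear ramp from 0 to 1 over the `ramp_frames` frames
    leading up to `target_frame`. The source only states that the weight
    "gradually increases"; the shape is hypothetical.
    """
    weights = []
    for t in range(num_frames):
        distance = target_frame - t
        if distance < 0 or distance >= ramp_frames:
            # Frames after the target, or too far before it, get no weight.
            weights.append(0.0)
        else:
            weights.append(1.0 - distance / ramp_frames)
    return weights


def weighted_bce(probs, labels, weights, eps=1e-7):
    """Weighted binary cross-entropy, normalized by the total weight.

    Assumption: the weighting value is applied to a per-frame loss.
    """
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    norm = sum(weights)
    return total / norm if norm > 0 else 0.0


def fuse(p_acoustic, p_lexical, alpha=0.5):
    """Combine the two model outputs into one backchannel probability.

    Assumption: simple linear interpolation with mixing factor `alpha`.
    """
    return alpha * p_acoustic + (1.0 - alpha) * p_lexical


# Weights for a 6-frame sequence whose suitable timing is frame 4.
print(prior_weights(6, 4, 4))  # [0.0, 0.25, 0.5, 0.75, 1.0, 0.0]
```

Under this reading, frames closer to the response timing contribute more to the loss, so the lexical model is encouraged to commit to a prediction before the ASR output for the final words arrives. Other ramp shapes or fusion rules would fit the description equally well.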