Description: |
Task-oriented dialogue systems are complex natural language applications employed in fields such as health care, sales assistance, and digital customer service. Although the literature suggests several approaches to managing this type of dialogue system, only a few studies compare the performance of different techniques. In this paper, we present a comparison between supervised learning, using the transformer architecture, and reinforcement learning, using two flavors of Deep Q-Learning (DQN) algorithms. Our experiments use the MultiWOZ dataset and a real-world digital customer service dataset, and show that integrating expert pre-defined rules with DQN outperforms the supervised approaches. We also propose a method to make better use of the designer’s knowledge by improving how interactions collected during warm-up are used in the training phase. Our results indicate a reduction in training time when the designer’s knowledge, expressed as pre-defined rules, is preserved in memory during the initial steps of the DQN training procedure. |