MTP: A Dataset for Multi-Modal Turning Points in Casual Conversations

Authors: Ho, Gia-Bao Dinh; Tan, Chang Wei; Darban, Zahra Zamanzadeh; Salehi, Mahsa; Haffari, Gholamreza; Buntine, Wray
Year of publication: 2024
Subject:
Document type: Working Paper
Description: Detecting critical moments, such as emotional outbursts or changes in decisions during conversations, is crucial for understanding shifts in human behavior and their consequences. Our work introduces a novel problem setting focusing on these moments as turning points (TPs), accompanied by a meticulously curated, high-consensus, human-annotated multi-modal dataset. We provide precise timestamps, descriptions, and visual-textual evidence highlighting changes in emotions, behaviors, perspectives, and decisions at these turning points. We also propose a framework, TPMaven, which uses state-of-the-art vision-language models to construct a narrative from the videos and large language models to classify and detect turning points in our multi-modal dataset. Evaluation results show that TPMaven achieves an F1-score of 0.88 in classification and 0.61 in detection, with additional explanations aligning with human expectations.
Comment: Accepted to the ACL 2024 main conference
Database: arXiv