Kinetics and Scene Features for Intent Detection

Author: Ching-Yung Lin, Serena Yuan, Tianle Zhu, Ziyin Wang, Vishal Anand, Wenfeng Lyu, Raksha Ramesh
Year of publication: 2020
Subject:
Source: ICMI Companion
DOI: 10.1145/3395035.3425641
Description: We create multi-modal fusion models to predict relational classes within entities in free-form inputs such as unseen movies. Our approach identifies information-rich features within individual sources: emotion, text-attention, age, gender, and contextual background object tracking. This information is absorbed and contrasted against baseline fusion architectures. These five models then showcase future research areas on this challenging problem of relational knowledge extraction from movies and free-form multi-modal input sources. We find that, generally, the Kinetics model augmented with Attributes and Objects beats the baseline models.
Database: OpenAIRE