Self-Supervised Learning of Person-Specific Facial Dynamics for Automatic Personality Recognition
Authors: | Siyang Song, Georgios Tzimiropoulos, Linlin Shen, Shashank Jaiswal, Michel Valstar, Enrique Sanchez |
---|---|
Year of publication: | 2023 |
Subject: | Computer science; Rank (computer programming); Machine learning; Facial recognition system; Human-Computer Interaction; Face (geometry); Task analysis; Personality; Artificial intelligence; Big Five personality traits; Set (psychology); Representation (mathematics); Software |
Source: | IEEE Transactions on Affective Computing. 14:178-195 |
ISSN: | 2371-9850 |
DOI: | 10.1109/taffc.2021.3064601 |
Description: | This paper aims to solve two important issues that frequently occur in existing automatic personality analysis systems: (1) the use of very short video segments, or even single frames, to infer personality traits; and (2) the lack of methods to encode person-specific facial dynamics for personality recognition. Hence, we propose a novel Rank Loss that uses the natural temporal evolution of facial actions, rather than personality labels, for self-supervised learning of facial dynamics. Our approach first trains a generic U-net model that can infer general facial dynamics learned from unlabelled face videos. Then, the generic model is frozen, and a set of intermediate filters is incorporated into this architecture. The self-supervised learning is then resumed with only person-specific videos. This way, the learned filters' weights are person-specific, making them a valuable source for modeling person-specific facial dynamics. We then concatenate the weights of the learned filters as a person-specific representation, which can be directly used to predict the personality traits without needing other parts of the network. We evaluate the proposed approach on both self-reported personality and apparent personality datasets. Besides achieving promising results in personality trait estimation from videos, we show that fusion of tasks reaches the highest accuracy, and that multi-scale dynamics are more informative than single-scale dynamics. |
Database: | OpenAIRE |
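
The description states that the weights of the learned person-specific filters are concatenated into a single representation. A minimal sketch of that one step follows, using only the Python standard library. The function name `person_representation`, the nested-list filter layout, and the toy 2x2 filter values are assumptions made for illustration; the record does not specify filter shapes, their number, or any implementation details, so this should not be read as the authors' code.

```python
from typing import List, Sequence

# Hypothetical layout: each filter is a 2-D kernel given as rows of floats.
Filter = Sequence[Sequence[float]]


def person_representation(filters: Sequence[Filter]) -> List[float]:
    """Flatten every learned filter and concatenate them into one vector."""
    vec: List[float] = []
    for kernel in filters:
        for row in kernel:
            vec.extend(float(w) for w in row)
    return vec


# Toy stand-in for the filters learned on one person's videos (made-up values).
filters_person_a = [
    [[0.1, -0.2], [0.0, 0.3]],
    [[0.5, 0.5], [-0.1, 0.2]],
]

rep = person_representation(filters_person_a)
print(len(rep))  # two 2x2 filters give a vector of length 8
```

In this reading, a vector such as `rep` would be the only input a downstream personality-trait predictor needs, which matches the description's remark that the rest of the network is not required at prediction time.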