Leveraging Text Representation and Face-head Tracking for Long-form Multimodal Semantic Relation Understanding
Authors: | Raksha Ramesh, Vishal Anand, Zifan Chen, Yifei Dong, Yun Chen, Ching-Yung Lin |
---|---|
Year of publication: | 2022 |
Source: | Proceedings of the 30th ACM International Conference on Multimedia. |
DOI: | 10.1145/3503161.3551610 |
Database: | OpenAIRE |