Detecting Human-Object Interaction with Mixed Supervision
Authors: | Miaojing Shi, Suresh Kirthi Kumaraswamy, Ewa Kijak |
---|---|
Contributors: | Creating and exploiting explicit links between multimedia fragments (LinkMedia), MEDIA ET INTERACTIONS (IRISA-D6), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria), King's College London, Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), This work was partially supported by the READ-IT project, funded by the JPI Cultural Heritage under the European Union Horizon 2020 R&I program (grant agreement No. 699523). Miaojing Shi was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61828602., Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-MEDIA ET INTERACTIONS (IRISA-D6), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique) |
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Computer science; business.industry; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; 02 engineering and technology; 010501 environmental sciences; Object (computer science); 01 natural sciences; Pipeline (software); Image (mathematics); Task (project management); [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; Annotation; ComputingMethodologies_PATTERNRECOGNITION; Bounding overwatch; Robustness (computer science); [INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV]; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; State (computer science); business; 0105 earth and related environmental sciences |
Source: | WACV 2021 - Winter Conference on Applications of Computer Vision, Jan 2021, Waikoloa / Virtual, United States. pp.1-10 |
DOI: | 10.48550/arxiv.2011.04971 |
Description: | Human-object interaction (HOI) detection is an important task in image understanding and reasoning. An interaction takes the form of an HOI triplet (human, action, object), which requires bounding boxes for the human and the object as well as the action between them. The task therefore needs strong supervision for training, which is hard to procure. A natural way to overcome this is weakly-supervised learning, where only the presence of certain HOI triplets in an image is known and their exact locations are not. Most weakly-supervised methods make no provision for leveraging strongly supervised data when it is available, and a naive combination of these two paradigms in HOI detection fails to make them benefit each other. We therefore propose a mixed-supervised HOI detection pipeline, built on a specific design of momentum-independent learning that learns seamlessly across the two types of supervision. Moreover, in light of the annotation insufficiency in mixed supervision, we introduce an HOI element swapping technique that synthesizes diverse and hard negatives across images and improves the robustness of the model. Our method is evaluated on the challenging HICO-DET dataset. Using a mix of strong and weak annotations, it performs close to or even better than many fully-supervised methods; furthermore, it outperforms representative state-of-the-art weakly- and fully-supervised methods under the same supervision. Comment: WACV 2021 - camera ready |
Database: | OpenAIRE |
External link: |