Weakly Supervised Object Localization with Multi-Fold Multiple Instance Learning

Authors: Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid
Contributors: Department of Computer Engineering, Universite Bilkent [Ankara], Bilkent University [Ankara]-Bilkent University [Ankara], Apprentissage de modèles à partir de données massives (Thoth), Laboratoire Jean Kuntzmann (LJK), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria), ERC_Allegro, AXES, European Project: 320559,EC:FP7:ERC,ERC-2012-ADG_20120216,ALLEGRO(2013), European Project: 269980,EC:FP7:ICT,FP7-ICT-2009-6,AXES(2011), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019]), OpenMETU
Year of publication: 2016
Subject:
FOS: Computer and information sciences
Iterative method
Computer science
Iterative methods
Object detection
Location
Experimental evaluation
Computer Vision and Pattern Recognition (cs.CV)
Supervised trainings
Computer Science - Computer Vision and Pattern Recognition
Fisher vector
Convolutional neural network
02 engineering and technology
Object localization
Artificial Intelligence
Minimum bounding box
0202 electrical engineering, electronic engineering, information engineering
Locks (fasteners)
Computer vision
Refinement methods
Learning systems
Weakly supervised learning
Applied Mathematics
Multiple instance learning
Supervised learning
[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]
020207 software engineering
Pattern recognition
Pascal (programming language)
Object recognition
Visualization
Computational Theory and Mathematics
Localization accuracy
020201 artificial intelligence & image processing
Viola–Jones object detection framework
Computer Vision and Pattern Recognition
Artificial intelligence
Software
Neural networks
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence
IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2017, 39 (1), pp.189-203. ⟨10.1109/TPAMI.2016.2535231⟩
ISSN: 1939-3539
0162-8828
DOI: 10.1109/TPAMI.2016.2535231
Description: Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when using high-dimensional representations, such as Fisher vectors and convolutional neural network features. We also propose a window refinement method, which improves the localization accuracy by incorporating an objectness prior. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset, which verifies the effectiveness of our approach.
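Illustrative note: the multi-fold idea in the description can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the ridge-regression classifier stands in for the detector, the fold count, iteration count, and first-window initialization are arbitrary choices, and the names `train_linear`, `score`, and `multi_fold_mil` are hypothetical. Each positive image is a "bag" of candidate window features; windows in a held-out fold are re-localized with a detector trained only on the other folds, so a bag's currently selected window never helps score itself.

```python
import numpy as np

def train_linear(pos, neg, reg=1.0):
    """Ridge-regression classifier; a simple stand-in for the detector (assumption)."""
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    d = Xb.shape[1]
    return np.linalg.solve(Xb.T @ Xb + reg * np.eye(d), Xb.T @ y)

def score(w, windows):
    """Score every candidate window of one image."""
    return windows @ w[:-1] + w[-1]

def multi_fold_mil(pos_bags, neg_feats, n_folds=3, n_iters=5, seed=0):
    """pos_bags: list of (n_windows, dim) arrays, one per positive image.
    neg_feats: (n_neg, dim) array of windows from negative images."""
    rng = np.random.default_rng(seed)
    n = len(pos_bags)
    selected = np.zeros(n, dtype=int)  # hypothetical init: first window of each bag
    for _ in range(n_iters):
        folds = np.array_split(rng.permutation(n), n_folds)
        new_selected = selected.copy()
        for held_out in folds:
            # Train only on positives outside the held-out fold.
            train_idx = np.setdiff1d(np.arange(n), held_out)
            pos = np.stack([pos_bags[i][selected[i]] for i in train_idx])
            w = train_linear(pos, neg_feats)
            # Re-localize: pick the top-scoring window in each held-out image.
            for i in held_out:
                new_selected[i] = int(np.argmax(score(w, pos_bags[i])))
        selected = new_selected
    # Final detector trained on all selected windows.
    pos = np.stack([pos_bags[i][selected[i]] for i in range(n)])
    return train_linear(pos, neg_feats), selected
```

Setting `n_folds` equal to the number of positive images would give a leave-one-out variant of the same loop. Standard multiple-instance learning, by contrast, would re-localize each image with a detector trained on that image's own current window.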
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Database: OpenAIRE