Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection
Author: Zhongzheng Ren, Jan Kautz, Yong Jae Lee, Alexander G. Schwing, Ming-Yu Liu, Xiaodong Yang, Zhiding Yu
Year of publication: 2020
Subject: FOS: Computer and information sciences; Computer Science - Computer Vision and Pattern Recognition (cs.CV); Computer Science - Machine Learning (cs.LG); FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Image and Video Processing (eess.IV); object detection; supervised learning; machine learning; discriminative model; ground truth; benchmark (computing); artificial intelligence
Source: CVPR
DOI: 10.1109/cvpr42600.2020.01061
Description: Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training. However, major challenges remain: (1) differentiating object instances can be ambiguous; (2) detectors tend to focus on discriminative parts rather than entire objects; (3) without ground truth, object proposals have to be redundant to achieve high recall, causing significant memory consumption. Addressing these challenges is difficult, as it often requires eliminating uncertainties and trivial solutions. To target these issues, we develop an instance-aware and context-focused unified framework. It employs an instance-aware self-training algorithm and a learnable Concrete DropBlock while devising a memory-efficient sequential batch back-propagation (see the sketch after this record). Our proposed method achieves state-of-the-art results on COCO ($12.1\%$ AP, $24.8\%$ AP$_{50}$), VOC 2007 ($54.9\%$ AP), and VOC 2012 ($52.1\%$ AP), improving over baselines by large margins. In addition, the proposed method is the first to benchmark ResNet-based models and weakly supervised video object detection. Code, models, and more details will be made available at: https://github.com/NVlabs/wetectron. CVPR 2020.
Database: OpenAIRE
External link:
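The description above only names the memory-saving trick, so the following is a minimal, hypothetical PyTorch sketch of what a sequential batch back-propagation step could look like: the backbone runs a single forward pass, the RoI head then processes the redundant proposals in chunks so that only one chunk's activations are kept in memory at a time, and the backbone finally receives one backward pass with the accumulated feature gradient. The function names (`backbone`, `roi_head`, `loss_fn`) and the chunk size are assumptions for illustration, not the authors' wetectron implementation.

```python
import torch

def sequential_batch_backprop(backbone, roi_head, images, rois, loss_fn, chunk_size=256):
    """Hypothetical sketch: process proposals in chunks so peak memory scales
    with the chunk size rather than with the full proposal set."""
    # 1) Single backbone forward; detach and mark the feature map as a leaf
    #    so per-chunk gradients accumulate into feats.grad.
    with torch.no_grad():
        feats = backbone(images)
    feats.requires_grad_(True)

    total_loss = 0.0
    # 2) RoI-head forward/backward chunk by chunk; each backward frees that
    #    chunk's activations and adds its gradient contribution to feats.grad.
    for chunk in torch.split(rois, chunk_size):
        scores = roi_head(feats, chunk)   # e.g. RoI pooling + MIL-style head
        loss = loss_fn(scores)            # image-level (weak) supervision only
        loss.backward()
        total_loss += loss.item()

    # 3) One backward pass through the backbone using the accumulated gradient.
    feats_full = backbone(images)         # recompute with autograd enabled
    feats_full.backward(feats.grad)
    return total_loss
```

A subsequent optimizer step over the parameters of both `backbone` and `roi_head` would follow this call; the design point is that the expensive backbone is traversed only twice per image batch regardless of how many proposal chunks are processed.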