Weakly supervised structured output learning for semantic segmentation

Authors: Vezhnevets, A.; Ferrari, V.; Buhmann, J. M.
Language: English
Year of publication: 2012
Subject:
Optimization
Domain adaptation
Computer science
optimisation
Gaussian processes
02 engineering and technology
hard optimization problem
0202 electrical engineering, electronic engineering, information engineering
Bayesian optimization problem
SIFT-flow dataset
Training
Computer vision
appearance model
Gaussian process
extremely randomized hashing forest
Visualization
Image segmentation
Measurement
multiple visual cues
business.industry
diverse superpixel feature
020206 networking & telecommunications
maximum expected agreement model selection
Object detection
Bayes methods
semantic segmentation
Active appearance model
Semantics
Kernel
Pattern recognition (psychology)
sparse binary vector
weakly supervised structured output learning
020201 artificial intelligence & image processing
learning (artificial intelligence)
Artificial intelligence
business
Source: Vezhnevets, A, Ferrari, V & Buhmann, J M 2012, Weakly supervised structured output learning for semantic segmentation. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. Institute of Electrical and Electronics Engineers (IEEE), pp. 845-852. https://doi.org/10.1109/CVPR.2012.6247757
DOI: 10.1109/CVPR.2012.6247757
Description: We address the problem of weakly supervised semantic segmentation. The training images are labeled only by the classes they contain, not by their location in the image. On test images instead, the method must predict a class label for every pixel. Our goal is to enable segmentation algorithms to use multiple visual cues in this weakly supervised setting, analogous to what is achieved by fully supervised methods. However, it is difficult to assess the relative usefulness of different visual cues from weakly supervised training data. We define a parametric family of structured models, where each model weights visual cues in a different way. We propose a Maximum Expected Agreement model selection principle that evaluates the quality of a model from the family without looking at superpixel labels. Searching for the best model is a hard optimization problem, which has no analytic gradient and multiple local optima. We cast it as a Bayesian optimization problem and propose an algorithm based on Gaussian processes to efficiently solve it. Our second contribution is an Extremely Randomized Hashing Forest that represents diverse superpixel features as a sparse binary vector. It enables using appearance models of visual classes that are fast at training and testing and yet accurate. Experiments on the SIFT-flow dataset show a significant improvement over previous weakly supervised methods and even over some fully supervised methods.
Database: OpenAIRE