Learning Compositional Models for Object Categories from Small Sample Sets
Author: | Song-Chun Zhu, Jake Porway, Benjamin Yao |
---|---|
Year of publication: | 2009 |
Subject: |
Kullback–Leibler divergence; Markov random field; Computer science; Cognitive neuroscience of visual object recognition; Machine learning;
Hierarchical database model; Spatial relation; Categorization; Stochastic context-free grammar; Entropy (information theory); Artificial intelligence |
DOI: | 10.1017/cbo9780511635465.014 |
Description: | In this chapter we present a method for learning a compositional model in a minimax entropy framework for modeling object categories with large intra-class variance. The model we learn combines the flexibility of a stochastic context-free grammar (SCFG), which accounts for the variation in object structure, with the neighborhood constraints of a Markov random field (MRF), which enforce spatial context. We learn the model through a generalized minimax entropy framework that accounts for the dynamic structure of the hierarchical model. We first learn the SCFG parameters using the frequencies of object parts, then pursue spatial relations in order of greatest information gain. The learned model can generalize from a small set of training samples (n < 100) to generate a combinatorially large number of novel instances using stochastic sampling. To verify our learning method and model performance, we present plots of KL divergence minimization as the algorithm proceeds, and show that samples from the model become more realistic as more spatial relations are added. We also show that the model accurately predicts missing or undetected parts for top-down recognition, along with preliminary results showing that the model can learn a large space of category appearances from a very small number of training samples (n < 15). This process is similar to “recognition-by-components”, a theory that postulates that biological vision systems recognize objects as composed from a dictionary of commonly appearing 3D structures. Finally, we discuss a compositional boosting algorithm for inference and show examples using it for object recognition. This article is a chapter from the book Object Categorization: Computer and Human Vision Perspectives, edited by Sven Dickinson, Ales Leonardis, Bernt Schiele, and Michael J. Tarr (Cambridge University Press). University of California Los Angeles, Los Angeles, CA. Lotus Hill Research Institute, EZhou, China. |
Database: | OpenAIRE |
External link: |
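
The two learning steps named in the description (estimating SCFG parameters from part frequencies, then choosing spatial relations by information gain) can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the toy data, the node and relation names, and the functions `estimate_branch_probs`, `sample_instance`, `kl_divergence`, and `pick_next_relation` are all hypothetical, and the relation-selection step is simplified to picking the candidate whose observed histogram has the largest KL divergence from the current model's histogram.

```python
import math
import random
from collections import Counter

# Hypothetical toy data: for each Or-node of a grammar, the alternative
# that was chosen in each training sample.
observed_choices = {
    "frame": ["round", "round", "square", "round"],
    "hands": ["two", "three", "two", "two"],
}

def estimate_branch_probs(choices):
    """Maximum-likelihood branching probabilities: relative frequencies."""
    probs = {}
    for node, picks in choices.items():
        counts = Counter(picks)
        total = sum(counts.values())
        probs[node] = {alt: c / total for alt, c in counts.items()}
    return probs

def sample_instance(probs, rng):
    """Draw one alternative per Or-node from the learned probabilities."""
    return {
        node: rng.choices(list(dist), weights=list(dist.values()))[0]
        for node, dist in probs.items()
    }

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as dicts."""
    return sum(
        pv * math.log(pv / max(q.get(k, 0.0), eps))
        for k, pv in p.items()
        if pv > 0
    )

def pick_next_relation(observed_hists, model_hists):
    """Simplified greedy step: pick the candidate relation whose observed
    histogram differs most (in KL divergence) from the current model's."""
    return max(
        observed_hists,
        key=lambda r: kl_divergence(observed_hists[r], model_hists[r]),
    )

branch_probs = estimate_branch_probs(observed_choices)
novel = sample_instance(branch_probs, random.Random(0))
```

Sampling each Or-node independently is what lets a handful of training samples yield many novel combinations; the relation-selection step is meant to show how spatial constraints could then be added one at a time to rule out implausible ones.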