Active learning with extremely sparse labeled examples

Authors: David R. Hardoon, Shiliang Sun
Publication year: 2010
Subject:
Source: Neurocomputing. 73:2980-2988
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2010.07.007
Description: Active learning generally assumes that labeled examples are available for training a classifier, which is then used to examine unlabeled data and select the most 'informative' examples for manual labeling. However, in some application domains only a limited number of labeled examples is available, in the most extreme case a single labeled example per category. In these scenarios, most existing active learning methodologies cannot be applied directly without first making an assumption about label assignment. In this paper we present a method for finding highly informative examples for manual labeling when only extremely limited labeled data is available during training. We propose using canonical correlation analysis to investigate the correlation between different views of the available data, and demonstrate that this measure can serve as a selection criterion for the novel application of active learning using only a single labeled example from each class. We demonstrate our method with promising experimental results on text classification, advertisement removal and multi-class image classification tasks.
Database: OpenAIRE
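
Illustrative sketch (not from the paper): the abstract describes using canonical correlation analysis (CCA) across two views of the data as a selection criterion when each class has only one labeled example. The record does not give the exact criterion, so the code below is only a rough guess at the general idea. It fits a regularized linear CCA on all examples (no labels needed), projects both views into the shared space, and queries the unlabeled examples whose similarity to the per-class labeled seeds is most ambiguous. The function names `fit_cca` and `select_queries`, the regularization constant, the cosine similarity, and the margin rule are all assumptions made for illustration.

```python
import numpy as np


def fit_cca(X, Y, k, reg=1e-3):
    """Regularized linear CCA on two views X (n x dx) and Y (n x dy).

    Returns the view means, the projection matrices for each view, and the
    top-k canonical correlations (as estimated under regularization).
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten each view with the inverse Cholesky factor of its covariance.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)          # Lx^{-1} Cxy
    M = np.linalg.solve(Ly, M.T).T        # Lx^{-1} Cxy Ly^{-T}
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])  # map back from whitened space
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return mx, my, Wx, Wy, s[:k]


def _cosine(A, B):
    """Pairwise cosine similarity between rows of A and rows of B."""
    A = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
    B = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-12)
    return A @ B.T


def select_queries(X, Y, labeled_idx, n_queries, k=2):
    """Pick unlabeled examples to send for manual labeling.

    labeled_idx holds one labeled example index per class (at least two
    classes). The criterion is an assumed one: a small margin between the
    two most similar class seeds means the example is ambiguous.
    """
    mx, my, Wx, Wy, _ = fit_cca(X, Y, k)
    Zx = (X - mx) @ Wx
    Zy = (Y - my) @ Wy
    # Similarity to each class seed, summed over the two projected views.
    sim = _cosine(Zx, Zx[labeled_idx]) + _cosine(Zy, Zy[labeled_idx])
    top2 = np.sort(sim, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    margin[labeled_idx] = np.inf          # never re-query the labeled seeds
    return np.argsort(margin)[:n_queries]
```

With two feature matrices describing the same examples (for instance page text and link text in a web classification task), `select_queries(X, Y, labeled_idx=[i0, i1], n_queries=10)` would return ten indices to label next. A margin-based rule is only one of several plausible ways to turn cross-view correlation into a ranking.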