Distinguishing mislabeled data from correctly labeled data in classifier design

Authors: Ilya Muchnik, Dmitriy Fradkin, Dimitris N. Metaxas, Casimir A. Kulikowski, Sundara Venkataraman
Publication year: 2005
Subject:
Source: ICTAI
DOI: 10.1109/ictai.2004.52
Description: We have developed a method for distinguishing between correctly labeled and mislabeled data sampled from video sequences and used to construct a facial expression recognition classifier. The novelty of our approach lies in training a single, optimal classifier type (a support vector machine, or SVM) on multiple representations of the data, each involving a different "discriminating" subspace. Results of a preliminary study on discriminating "high stress" from "low stress" facial expression data confirm that the approach can distinguish subproblems where labeling is highly reliable from those where mislabeling can lead to high error rates. By helping detect data subsamples that yield misleading classification results, the method also serves as a rapid, highly efficient cross-validated approach for eliminating outliers.
Database: OpenAIRE
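
Illustrative sketch (not taken from the paper): the abstract describes training one classifier type on several representations of the data and using cross-validation to locate unreliable labels. The Python below shows one way such a scheme could look, under loudly stated assumptions. The "representations" are simply hypothetical feature subsets, the SVM is a small bias-free linear SVM trained by batch subgradient descent (a stand-in, not the authors' implementation), the data are synthetic, and every name (train_linear_svm, disagreement_votes, SUBSPACES, FLIPPED) is invented for illustration. A sample is treated as suspect when held-out predictions contradict its given label in a majority of subspaces.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=200):
    """Bias-free linear SVM fitted by batch subgradient descent on
    lam/2 * ||w||^2 + mean hinge loss. A simple stand-in, assumed here
    only for illustration."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:  # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return +1 or -1 from the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def disagreement_votes(X, y, subspaces, n_folds=5):
    """For each sample, count the subspaces in which a held-out
    prediction contradicts the sample's given label."""
    votes = [0] * len(X)
    for cols in subspaces:
        Xs = [[row[c] for c in cols] for row in X]  # project onto subspace
        for fold in range(n_folds):
            train = [i for i in range(len(X)) if i % n_folds != fold]
            held_out = [i for i in range(len(X)) if i % n_folds == fold]
            w = train_linear_svm([Xs[i] for i in train],
                                 [y[i] for i in train])
            for i in held_out:
                if predict(w, Xs[i]) != y[i]:
                    votes[i] += 1
    return votes


# Synthetic two-class data, symmetric about the origin (assumption:
# this is why the sketch omits a bias term).
rng = random.Random(0)
true_y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[4.0 * t + rng.uniform(-1.0, 1.0) for _ in range(4)] for t in true_y]

FLIPPED = [3, 17, 28]  # indices whose labels are deliberately corrupted
y = [-t if i in FLIPPED else t for i, t in enumerate(true_y)]

SUBSPACES = [(0, 1), (2, 3), (0, 1, 2, 3)]  # hypothetical feature subsets
votes = disagreement_votes(X, y, SUBSPACES)
suspects = [i for i, v in enumerate(votes) if v > len(SUBSPACES) / 2]
print("suspected mislabeled samples:", suspects)
```

The majority threshold is a design choice made for this sketch. Requiring disagreement in every subspace would flag fewer samples, while a single-subspace threshold would flag more. Real video-derived features would also need a bias term and feature scaling, both omitted here.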