Predicting with Confidence: Classifiers that Know What They Don’t Know

Authors: Walker H. Land, J. David Schaffer
Year of publication: 2017
Subject:
Source: Procedia Computer Science. 114:200-207
ISSN: 1877-0509
DOI: 10.1016/j.procs.2017.09.061
Description: A useful feature for any predictive classifier is the ability to know when its predictions are unreliable. We present a general approach that should be applicable to any learning classifier system that has been trained on a set of known cases. The basic idea is simple: if the classifier knows its predictions are wrong for some of the training cases, it can assess whether any new case lies in the vicinity of one of these “trouble-makers.” The challenge is to quantify the degree of uncertainty and to define a region of unreliability around each trouble-maker case. We provide specific algorithms to address these challenges and illustrate their use with a GRNN oracle ensemble classifier that predicts the presence of Alzheimer’s disease from features extracted automatically from a sample of a person’s speech. One aspect of the challenge is that the distribution of training cases in the domain feature space is often quite poor, because training data sets are often feature-rich but case-poor. We show how the t-SNE algorithm can ameliorate this problem. We also provide an algorithm that defines a region of uncertainty based on linear interpolation of the error estimates among only those training cases that are “close enough.” No human input is needed.
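The abstract describes combining a t-SNE embedding with interpolation of error estimates among nearby training cases. Below is a minimal Python sketch of that general idea, not the paper's actual algorithm: the function name, the distance radius, and the inverse-distance weighting are illustrative assumptions, and the new cases are embedded jointly with the training cases because t-SNE has no native out-of-sample transform.

```python
import numpy as np
from sklearn.manifold import TSNE

def unreliability_scores(X_train, train_errors, X_new,
                         radius=5.0, perplexity=10, random_state=0):
    """Score how close each new case lies to known "trouble-maker" training cases.

    X_train      : (n_train, n_features) training feature matrix
    train_errors : (n_train,) per-case error estimates (e.g. 1.0 for a
                   misclassified trouble-maker, 0.0 for a correctly handled case)
    X_new        : (n_new, n_features) new cases to score
    radius       : distance in the embedded space defining "close enough"
                   (an assumed parameter, not from the paper)
    """
    # Embed training and new cases together in 2-D with t-SNE.
    X_all = np.vstack([X_train, X_new])
    embedding = TSNE(n_components=2, perplexity=perplexity,
                     random_state=random_state).fit_transform(X_all)
    E_train = embedding[:len(X_train)]
    E_new = embedding[len(X_train):]

    scores = np.zeros(len(X_new))
    for i, point in enumerate(E_new):
        dists = np.linalg.norm(E_train - point, axis=1)
        close = dists < radius                  # only "close enough" training cases
        if not np.any(close):
            continue                            # no nearby cases: score stays 0
        weights = 1.0 / (dists[close] + 1e-12)  # assumed inverse-distance weighting
        scores[i] = np.average(train_errors[close], weights=weights)
    return scores
```

In this sketch, a higher score means the new case falls near training cases for which the classifier's predictions were wrong, so its own prediction should be treated as less reliable.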
Database: OpenAIRE