Kernel Domain Description with Incomplete Data: Using Instance-Specific Margins to Avoid Imputation

Authors: Adam Gripton, Weiping Lu
Year of publication: 2010
Subject:
Source: ICPR
DOI: 10.1109/icpr.2010.716
Description: We present a method of performing kernel space domain description of a dataset with incomplete entries without the need for imputation, allowing kernel features of a class of data with missing features to be rigorously described. This addresses the problem that completion of absent data is usually required before kernel classifiers, such as support vector domain description (SVDD), can be applied; equally, few existing techniques for incomplete data adequately address the issue of kernel spaces. Our method, which we call instance-specific domain description (ISDD), uses a parametrisation framework to compute minimal kernelised distances between data points with missing features through a series of optimisation runs, allowing evaluation of the kernel distance while avoiding subjective completions of missing data. We compare results of our method against those achieved by SVDD applied to an imputed dataset, using synthetic and experimental datasets where feature absence has a non-trivial structure. We show that our method can achieve tighter sphere bounds when applied to linear and quadratic kernels.
Database: OpenAIRE
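
Illustrative note: the abstract describes computing minimal kernelised distances between points with missing features. The sketch below is not the authors' ISDD algorithm, which the record does not specify; it only illustrates the idea for the linear kernel, under the assumption that each missing entry is an unconstrained free parameter. In that special case the minimisation has a closed form: any coordinate missing in either point can be matched to the other point and contributes nothing, so only coordinates observed in both points remain. The function name and the use of NaN to mark missing entries are hypothetical choices made for this example.

```python
import numpy as np

def min_linear_kernel_sq_distance(x, y):
    """Smallest squared linear-kernel distance ||x - y||^2 over all
    completions of the missing (NaN) entries of x and y.

    Assumption (not from the source): missing entries are unconstrained,
    so each can be chosen to cancel its coordinate's contribution.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Only coordinates observed in both points contribute to the minimum.
    both_observed = ~np.isnan(x) & ~np.isnan(y)
    diff = x[both_observed] - y[both_observed]
    return float(diff @ diff)

# Hypothetical data: NaN marks a missing feature.
x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 5.0, np.nan])
print(min_linear_kernel_sq_distance(x, y))  # 1.0
```

For non-linear kernels such as the quadratic kernel mentioned in the abstract, no such simple closed form is assumed here; a numerical optimiser over the free entries would be needed, which is presumably what the "series of optimisation runs" refers to.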