Span error bound for weighted SVM with applications in hyperparameter selection

Autor: Sarafis, Ioannis, Diou, Christos, Delopoulos, Anastasios
Rok vydání: 2018
Předmět:
Druh dokumentu: Working Paper
Popis: Weighted SVM (or fuzzy SVM) is the most widely used SVM variant owning its effectiveness to the use of instance weights. Proper selection of the instance weights can lead to increased generalization performance. In this work, we extend the span error bound theory to weighted SVM and we introduce effective hyperparameter selection methods for the weighted SVM algorithm. The significance of the presented work is that enables the application of span bound and span-rule with weighted SVM. The span bound is an upper bound of the leave-one-out error that can be calculated using a single trained SVM model. This is important since leave-one-out error is an almost unbiased estimator of the test error. Similarly, the span-rule gives the actual value of the leave-one-out error. Thus, one can apply span bound and span-rule as computationally lightweight alternatives of leave-one-out procedure for hyperparameter selection. The main theoretical contributions are: (a) we prove the necessary and sufficient condition for the existence of the span of a support vector in weighted SVM; and (b) we prove the extension of span bound and span-rule to weighted SVM. We experimentally evaluate the span bound and the span-rule for hyperparameter selection and we compare them with other methods that are applicable to weighted SVM: the $K$-fold cross-validation and the ${\xi}-{\alpha}$ bound. Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, it can be calculated faster than $K$-fold cross-validation.
Comment: 35 pages, 6 figures
Databáze: arXiv