Approximate l-Fold Cross-Validation with Least Squares SVM and Kernel Ridge Regression

Authors: Richard E. Edwards, Hao Zhang, Joshua Ryan New, Lynne E. Parker
Year of publication: 2013
Subject:
Source: ICMLA (1)
Description: Kernel methods have difficulty scaling to large modern data sets. These scalability issues stem from the computational and memory requirements of working with a large kernel matrix. Over the years, these requirements have been addressed by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity of solving a single model and the overall computational complexity of tuning hyperparameters remain major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately one million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real-world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
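
To illustrate why a circulant approximation helps, the sketch below solves a ridge-style system (C + lam*I) alpha = y with the FFT in O(n log n) instead of the O(n^3) dense solve. It is not the authors' algorithm: it uses a plain one-level circulant matrix rather than a multi-level one, it does no cross-validation, and the names `solve_circulant_ridge`, `circulant`, `first_col`, and `lam` are hypothetical, chosen only for this example.

```python
import numpy as np


def solve_circulant_ridge(first_col, y, lam):
    """Solve (C + lam*I) alpha = y, where C is the circulant matrix
    whose first column is `first_col` (assumed real and symmetric)."""
    # The DFT diagonalizes every circulant matrix; its eigenvalues are
    # the FFT of the first column.
    eigenvalues = np.fft.fft(first_col)
    # Divide in the Fourier domain, then transform back.
    alpha = np.fft.ifft(np.fft.fft(y) / (eigenvalues + lam))
    return alpha.real


def circulant(first_col):
    """Build the dense circulant matrix C[i, j] = first_col[(i - j) mod n].
    Used only to check the fast solver on a small problem."""
    n = len(first_col)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return first_col[idx]


if __name__ == "__main__":
    n = 64
    k = np.arange(n)
    wrapped_dist = np.minimum(k, n - k)            # distance on a periodic grid
    c = np.exp(-wrapped_dist**2 / (2 * 2.0**2))    # Gaussian-like first column
    y = np.random.default_rng(0).standard_normal(n)
    fast = solve_circulant_ridge(c, y, lam=0.1)
    dense = np.linalg.solve(circulant(c) + 0.1 * np.eye(n), y)
    print(np.allclose(fast, dense))
```

The dense matrix is built here only to compare against `np.linalg.solve` on a small problem; the FFT-based solver itself needs just the first column, which is also where the memory savings of a circulant approximation come from.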
Database: OpenAIRE