Approximate l-Fold Cross-Validation with Least Squares SVM and Kernel Ridge Regression
Author: | Richard E. Edwards, Hao Zhang, Joshua Ryan New, Lynne E. Parker |
Year of publication: | 2013 |
Subject: |
Theoretical computer science, Computer science, Regression analysis, Least squares, Cross-validation, Kernel principal component analysis, Support vector machine, Kernel (linear algebra), Matrix (mathematics), Kernel method, Kernel embedding of distributions, Variable kernel density estimation, Polynomial kernel, Kernel (statistics), Least squares support vector machine, Radial basis function kernel, Principal component regression, Total least squares |
Source: | ICMLA (1) |
Description: | Kernel methods have difficulty scaling to large modern data sets. The scalability issues stem from the computational and memory requirements of working with a large kernel matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity for solving a single model and the overall cost of tuning hyperparameters remain major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately one million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real-world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method. |
Database: | OpenAIRE |
External link: |
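The O(n log n) claim in the abstract rests on a standard property of circulant matrices: a circulant matrix is diagonalized by the discrete Fourier transform, so a regularized linear system of the LS-SVM/kernel-ridge form (C + I/γ)α = y can be solved with FFTs instead of an O(n^3) dense solve. Below is a minimal one-level sketch in NumPy, not the authors' implementation (the paper uses multi-level circulant matrices; the 1-D grid, kernel width `sigma`, and regularization `gamma` here are illustrative assumptions):

```python
import numpy as np

def circulant_kernel_first_column(n, sigma=1.0):
    """First column of a circulant approximation of an RBF kernel on a
    regular 1-D grid, using wrap-around (periodic) distances."""
    idx = np.arange(n)
    dist = np.minimum(idx, n - idx)          # periodic distance to point 0
    return np.exp(-dist**2 / (2.0 * sigma**2))

def solve_circulant_ls_svm(c, y, gamma=10.0):
    """Solve (C + I/gamma) alpha = y in O(n log n), where C is the
    circulant matrix with first column c: the DFT diagonalizes C, so its
    eigenvalues are fft(c) and the solve is elementwise in Fourier space."""
    eig = np.fft.fft(c) + 1.0 / gamma        # eigenvalues of C + I/gamma
    alpha = np.fft.ifft(np.fft.fft(y) / eig)
    return alpha.real

n = 256
c = circulant_kernel_first_column(n, sigma=4.0)
y = np.sin(2 * np.pi * np.arange(n) / n)
alpha = solve_circulant_ls_svm(c, y)

# Sanity check against the dense O(n^3) solve of the same system.
C = np.column_stack([np.roll(c, k) for k in range(n)])  # full circulant matrix
alpha_dense = np.linalg.solve(C + np.eye(n) / 10.0, y)
print(np.allclose(alpha, alpha_dense))       # prints True
```

The same diagonalization is what makes the paper's approximate l-fold cross-validation cheap: once the kernel is replaced by a (multi-level) circulant approximation, every hyperparameter setting reuses the same FFT-space eigenvalues, so only the O(n) elementwise division changes per candidate γ.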