Application of large-scale L2-SVM for microarray classification
Authors: | Baole Han, Chuandong Qin, Baosheng Li |
---|---|
Publication year: | 2021 |
Subject: |
Computer science; Computation; Scale (descriptive set theory); Sample (statistics); Theoretical Computer Science; Support vector machine; Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; Stochastic gradient descent; Hardware and Architecture; Hinge loss; Convergence (routing); Algorithm; Software; Information Systems |
Source: | The Journal of Supercomputing. 78:2265-2286 |
ISSN: | 1573-0484 0920-8542 |
DOI: | 10.1007/s11227-021-03962-7 |
Description: | Traditional classification algorithms work well on small-scale microarray datasets, but in large-scale scenarios ordinary machines can no longer run them because of their memory and time costs. In this paper, we design a new application framework to perform the computation as quickly as possible. First, the synthetic minority over-sampling technique (SMOTE) is used to oversample the minority classes and obtain balanced data. Then, a large-scale algorithm for $L_{2}$-SVM based on stochastic gradient descent is proposed and used for microarray classification. We also give a simple proof of the convergence of the stochastic gradient descent algorithm. Next, various large-scale algorithms for support vector machines are run on the microarray datasets to identify the most appropriate one. Finally, a comparative analysis of loss functions is carried out to clarify their differences. The experimental results show that stochastic gradient descent combined with the squared hinge loss is an attractive choice, achieving high accuracy in seconds. |
Database: | OpenAIRE |
External link: |
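
The abstract above names stochastic gradient descent on the squared hinge loss ($L_{2}$-SVM) as its core method. The following is a minimal sketch of that general technique, not the authors' implementation: the objective form $\frac{\lambda}{2}\|w\|^2 + \frac{1}{n}\sum_i \max(0, 1 - y_i w^\top x_i)^2$, the step-size schedule, the function names `squared_hinge_sgd` and `make_toy_data`, and all parameter values are assumptions chosen for illustration. The data are synthetic Gaussian samples, not microarray data, and the SMOTE balancing step is omitted.

```python
import numpy as np


def squared_hinge_sgd(X, y, lam=1e-2, epochs=20, eta0=0.01, seed=0):
    """SGD for a linear L2-SVM (squared hinge loss), labels in {-1, +1}.

    Assumed objective: lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w))**2).
    The decaying step size eta0 / (1 + lam * eta0 * t) is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for i in rng.permutation(n):
            t += 1
            eta = eta0 / (1.0 + lam * eta0 * t)
            margin = 1.0 - y[i] * (X[i] @ w)
            # Gradient of the regularizer.
            grad = lam * w
            # The squared hinge term contributes only when the margin is violated.
            if margin > 0.0:
                grad = grad - 2.0 * margin * y[i] * X[i]
            w -= eta * grad
    return w


def make_toy_data(n=400, d=20, seed=0):
    """Hypothetical synthetic data: two Gaussian classes separated along feature 0."""
    rng = np.random.default_rng(seed)
    y = rng.choice([-1.0, 1.0], size=n)
    X = rng.normal(size=(n, d))
    X[:, 0] += 2.0 * y
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    w = squared_hinge_sgd(X, y)
    accuracy = np.mean(np.sign(X @ w) == y)
    print(f"training accuracy: {accuracy:.3f}")
```

Because the squared hinge loss is differentiable, each update uses an ordinary gradient rather than a subgradient. The small initial step size keeps the per-sample update stable for feature vectors of this norm; real microarray data, with thousands of features, would likely need feature scaling or a smaller step.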