Application of large-scale L2-SVM for microarray classification

Author:	Baole Han, Chuandong Qin, Baosheng Li
Publication year:	2021
Subject:	Computer science; Computation; Scale (descriptive set theory); Sample (statistics); Theoretical Computer Science; Support vector machine; Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; Stochastic gradient descent; Hardware and Architecture; Hinge loss; Convergence (routing); Algorithm; Software; Information Systems
Source:	The Journal of Supercomputing. 78:2265-2286
ISSN:	1573-0484 0920-8542
DOI:	10.1007/s11227-021-03962-7
Description:	Traditional classification algorithms work well on small-scale microarray datasets, but in large-scale scenarios ordinary machines can no longer run these algorithms because of their memory and time costs. In this paper, we design a new application framework that performs the computation as fast as possible. First, the synthetic minority over-sampling technique (SMOTE) is used to oversample the minority classes and obtain balanced data. Then, a large-scale algorithm for $L_{2}$-SVM based on stochastic gradient descent is proposed and used for microarray classification. We also give a simple proof of the convergence of the stochastic gradient descent algorithm. Next, various large-scale algorithms for support vector machines are run on the microarray datasets to identify the most appropriate one. Finally, a comparative analysis of loss functions is carried out to clarify their differences. The experimental results show that combining the stochastic gradient descent algorithm with the squared hinge loss is an attractive choice that can achieve high accuracy in seconds.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e66d44b82e0cdd592e16819ccc298e08 https://doi.org/10.1007/s11227-021-03962-7
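
Illustrative note: the description above combines stochastic gradient descent with the squared hinge loss ($L_{2}$-SVM). The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a linear model without a bias term, labels in {-1, +1}, the objective lam/2 * ||w||^2 + max(0, 1 - y * w.x)^2 per sample, and a decaying step size. The names `squared_hinge_sgd`, `predict`, `lam`, and `eta0` are hypothetical, and the SMOTE balancing step is not shown.

```python
import random


def squared_hinge_sgd(X, y, lam=0.01, eta0=0.01, epochs=20, seed=0):
    """Fit a linear L2-SVM by plain SGD (illustrative sketch only).

    X: list of feature lists, y: list of labels in {-1, +1}.
    lam and eta0 are assumed hyperparameters, not values from the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order
        for i in order:
            t += 1
            eta = eta0 / (1.0 + eta0 * lam * t)  # decaying step size
            score = sum(wj * xj for wj, xj in zip(w, X[i]))
            slack = max(0.0, 1.0 - y[i] * score)
            # Gradient of lam/2*||w||^2 + slack^2 with respect to w;
            # the loss term vanishes when the margin is already >= 1.
            w = [
                wj - eta * (lam * wj - 2.0 * slack * y[i] * xj)
                for wj, xj in zip(w, X[i])
            ]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of w.x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
```

Each update touches a single sample, so memory use stays proportional to the number of features, which is the usual reason SGD is considered for large-scale settings like the one described.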