Sampling Method for Fast Training of Support Vector Data Description

Author:	Arin Chaudhuri, Hansi Jiang, Deovrat Kakde, Sergiy Peredriy, Seunghyun Kong, Maria Jahja, Wei Xiao
Language:	English
Year of publication:	2016
Subject:	FOS: Computer and information sciences; Computer Science - Machine Learning; Computer science; Computation; Sampling (statistics); Machine Learning (stat.ML); Statistics - Applications; 030218 nuclear medicine & medical imaging; Machine Learning (cs.LG); Set (abstract data type); Support vector machine; 03 medical and health sciences; Kernel (linear algebra); 0302 clinical medicine; Statistics - Machine Learning; 030220 oncology & carcinogenesis; Data quality; Convergence (routing); Anomaly detection; Applications (stat.AP); Data mining
Description:	Support Vector Data Description (SVDD) is a popular outlier detection technique that constructs a flexible description of the input data. SVDD computation time is high for large training data sets, which limits its use in big-data process monitoring applications. We propose a new iterative, sampling-based method for SVDD training. At each iteration, the method incrementally learns the training data description by computing SVDD on an independent random sample selected with replacement from the training data set. The experimental results indicate that the proposed method is extremely fast and provides a good-quality data description.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6280b4e8c804146ba31a3a1daec903c5 http://arxiv.org/abs/1606.05382
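
The description in this record outlines an iterative scheme: draw a small random sample with replacement, compute SVDD on it, and update the learned description. The following is a minimal Python sketch of that idea under several assumptions that are not stated in the record: a Gaussian kernel, a hard-margin formulation (no outlier fraction), a simple Frank-Wolfe loop standing in for a real quadratic-programming solver, a merge step that recomputes SVDD on the union of the current support vectors and the new sample's support vectors, and a stopping rule based on the relative change in the squared radius. All function and parameter names (rbf, dist2, svdd, sampling_svdd, patience, and so on) are hypothetical and chosen for illustration only.

```python
import math
import random

def rbf(x, y, s):
    """Gaussian kernel with bandwidth s (assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * s * s))

def dist2(z, svs, w, s):
    """Squared kernel-space distance from z to the center sum_i w[i]*phi(svs[i])."""
    const = sum(wi * wj * rbf(p, q, s)
                for p, wi in zip(svs, w) for q, wj in zip(svs, w))
    return 1.0 - 2.0 * sum(wi * rbf(p, z, s) for p, wi in zip(svs, w)) + const

def svdd(points, s=1.0, iters=300, tol=0.05):
    """Approximate hard-margin SVDD; Frank-Wolfe is a stand-in for a QP solver.

    Points must be hashable (tuples). Returns (support_vectors, weights, r2).
    """
    pts = list(dict.fromkeys(points))  # drop duplicate points, keep order
    n = len(pts)
    K = [[rbf(p, q, s) for q in pts] for p in pts]
    alpha = [0.0] * n
    alpha[0] = 1.0
    Ka = K[0][:]  # Ka[i] = sum_j K[i][j] * alpha[j]
    for k in range(iters):
        # With a Gaussian kernel, the smallest Ka[i] marks the point
        # farthest from the current center.
        far = min(range(n), key=Ka.__getitem__)
        step = 2.0 / (k + 2)  # standard Frank-Wolfe step size
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[far] += step
        Ka = [(1.0 - step) * v + step * K[i][far] for i, v in enumerate(Ka)]
    # Treat weights far below the largest one as zero (heuristic cut-off).
    cut = tol * max(alpha)
    keep = [i for i in range(n) if alpha[i] >= cut]
    total = sum(alpha[i] for i in keep)
    svs = [pts[i] for i in keep]
    w = [alpha[i] / total for i in keep]
    # Squared radius: largest distance from the center to any input point.
    r2 = max(dist2(p, svs, w, s) for p in pts)
    return svs, w, r2

def sampling_svdd(data, sample_size=6, s=1.0, max_iter=100,
                  eps=1e-3, patience=5, seed=0):
    """Sampling-based training loop (merge and stopping rules are assumptions)."""
    rng = random.Random(seed)
    master, weights, r2 = [], [], None
    stable = 0
    for _ in range(max_iter):
        sample = rng.choices(data, k=sample_size)  # independent draw, with replacement
        sample_svs, _, _ = svdd(sample, s)
        # Merge: recompute SVDD on current support vectors plus the new ones.
        master, weights, new_r2 = svdd(master + sample_svs, s)
        if r2 is not None and abs(new_r2 - r2) <= eps * r2:
            stable += 1
        else:
            stable = 0
        r2 = new_r2
        if stable >= patience:  # radius unchanged for several rounds
            break
    return master, weights, r2

if __name__ == "__main__":
    gen = random.Random(1)
    data = [(gen.uniform(-1, 1), gen.uniform(-1, 1)) for _ in range(500)]
    svs, w, r2 = sampling_svdd(data)
    print(len(svs), "support vectors, R^2 =", round(r2, 4))
    print("far point flagged:", dist2((100.0, 100.0), svs, w, 1.0) > r2)
```

A new observation z would be scored by comparing dist2(z, svs, w, s) with the returned r2; values above r2 fall outside the learned description. Each inner SVDD call only ever sees a handful of points, which is what keeps the per-iteration cost low in this sketch. A production implementation would replace the Frank-Wolfe loop with an exact solver.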