Sampling Method for Fast Training of Support Vector Data Description
Author: | Arin Chaudhuri, Hansi Jiang, Deovrat Kakde, Sergiy Peredriy, Seunghyun Kong, Maria Jahja, Wei Xiao |
---|---|
Language: | English |
Year of publication: | 2016 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Computer science; Computation; Sampling (statistics); Machine Learning (stat.ML); computer.software_genre; Statistics - Applications; 030218 nuclear medicine & medical imaging; Machine Learning (cs.LG); Set (abstract data type); Support vector machine; 03 medical and health sciences; Kernel (linear algebra); 0302 clinical medicine; Statistics - Machine Learning; 030220 oncology & carcinogenesis; Data quality; Convergence (routing); Anomaly detection; Applications (stat.AP); Data mining; computer |
Description: | Support Vector Data Description (SVDD) is a popular outlier detection technique that constructs a flexible description of the input data. SVDD computation time is high for large training datasets, which limits its use in big-data process monitoring applications. We propose a new iterative, sampling-based method for SVDD training. At each iteration, the method incrementally learns the training data description by computing SVDD on an independent random sample selected with replacement from the training dataset. The experimental results indicate that the proposed method is extremely fast and provides a good-quality data description. |
Database: | OpenAIRE |
External link: |
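
The iterative scheme summarized in the Description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record only states that SVDD is computed on independent random samples drawn with replacement and that the description is learned incrementally; everything else here is an assumption made for illustration: the Gaussian kernel and its `bandwidth`, the hard-margin SVDD solved approximately with Frank-Wolfe steps (`fit_svdd`), the merge step that refits on the union of the current support vectors and the sample's support vectors, and the stopping rule based on a stable squared radius. All function and parameter names are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel (assumed here; the record does not name a kernel)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def distance2(z, sv, w, bandwidth):
    """Squared feature-space distance from z to the centre sum_i w_i * phi(sv_i)."""
    cross = sum(wi * gaussian_kernel(s, z, bandwidth) for s, wi in zip(sv, w))
    norm = sum(wi * wj * gaussian_kernel(si, sj, bandwidth)
               for si, wi in zip(sv, w) for sj, wj in zip(sv, w))
    return 1.0 - 2.0 * cross + norm  # K(z, z) = 1 for the Gaussian kernel


def fit_svdd(points, bandwidth, n_steps=100, tol=1e-6):
    """Approximate hard-margin SVDD (no outlier fraction) via Frank-Wolfe.

    Returns (support_vectors, weights, squared_radius).
    """
    n = len(points)
    K = [[gaussian_kernel(p, q, bandwidth) for q in points] for p in points]
    alpha = [1.0 / n] * n
    for k in range(n_steps):
        Ka = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        # With K_ii = 1, the point farthest from the centre minimises (K alpha)_i.
        far = min(range(n), key=Ka.__getitem__)
        gamma = 2.0 / (k + 2.0)  # standard Frank-Wolfe step size
        alpha = [(1.0 - gamma) * a for a in alpha]
        alpha[far] += gamma
    keep = [i for i in range(n) if alpha[i] > tol]
    total = sum(alpha[i] for i in keep)
    sv = [points[i] for i in keep]
    w = [alpha[i] / total for i in keep]
    r2 = max(distance2(p, sv, w, bandwidth) for p in points)
    return sv, w, r2


def sampled_svdd(data, sample_size, bandwidth,
                 max_iter=30, tol=1e-3, patience=5, seed=0):
    """Sampling loop: small SVDD fits on random samples, merged incrementally."""
    rng = random.Random(seed)
    master, weights, r2 = [], [], None
    stable = 0
    for _ in range(max_iter):
        # Independent random sample, drawn with replacement.
        sample = [rng.choice(data) for _ in range(sample_size)]
        sv_sample, _, _ = fit_svdd(sample, bandwidth)
        # Assumed merge step: refit on current master set plus new support vectors.
        union = list(dict.fromkeys(master + sv_sample))
        r2_prev = r2
        master, weights, r2 = fit_svdd(union, bandwidth)
        # Assumed stopping rule: squared radius stable for `patience` iterations.
        if r2_prev is not None and abs(r2 - r2_prev) <= tol * r2_prev:
            stable += 1
            if stable >= patience:
                break
        else:
            stable = 0
    return master, weights, r2
```

Training points are expected as tuples of floats. A new observation `z` can then be scored by comparing `distance2(z, sv, w, bandwidth)` with the returned squared radius; a larger value places `z` outside the learned description. Each fit only ever sees `sample_size` points plus the current master set, which is the reason a scheme of this kind can be cheaper than a single SVDD fit on the full training data.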