A Fast Two-Level Approximate Euclidean Minimum Spanning Tree Algorithm for High-Dimensional Data
Authors: | Xia Li Wang, Xiaochun Wang, Xiaqiong Li |
---|---|
Year of publication: | 2018 |
Subject: |
Clustering high-dimensional data; Computational complexity theory; Computer science; Nearest neighbor search; Boundary (topology); Scale (descriptive set theory); Minimum spanning tree; Data set; Euclidean minimum spanning tree; Algorithm;
02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 020204 information systems |
Source: | Machine Learning and Data Mining in Pattern Recognition, MLDM (2); ISBN: 9783319961323 |
DOI: | 10.1007/978-3-319-96133-0_21 |
Description: | Euclidean minimum spanning tree algorithms typically run with quadratic computational complexity, which is impractical for large-scale, high-dimensional datasets. In this paper, we propose a new two-level approximate Euclidean minimum spanning tree algorithm for high-dimensional data. In the first level, we perform outlier detection on a given data set to identify a small number of boundary points and run the standard Prim’s algorithm on the reduced dataset. In the second level, we conduct a k-nearest neighbors search to complete the construction of the approximate Euclidean minimum spanning tree. Experimental results on sample data sets demonstrate the efficiency of the proposed method while maintaining high approximation accuracy. |
Database: | OpenAIRE |
External link: |
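
The abstract names the two levels but not their details, so the following is only a rough sketch of the general idea, not the authors' method. It assumes, purely for illustration, that "boundary points" are the points farthest from the data centroid, and that the second level attaches each remaining point to its nearest neighbor already in the tree (the k = 1 case of a nearest-neighbor search, done here by brute force). The names `prim_mst`, `two_level_approx_emst`, and `n_boundary` are hypothetical.

```python
import numpy as np

def prim_mst(points):
    """Exact Euclidean MST with dense Prim's algorithm, O(n^2) time.

    Returns a list of (i, j, distance) edges over row indices of `points`.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n < 2:
        return []
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    # best[j] = shortest known distance from j to the tree; parent[j] = that tree node
    best = np.linalg.norm(points - points[0], axis=1)
    parent = np.zeros(n, dtype=int)
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j, float(best[j])))
        in_tree[j] = True
        d = np.linalg.norm(points - points[j], axis=1)
        closer = (~in_tree) & (d < best)
        best[closer] = d[closer]
        parent[closer] = j
    return edges

def two_level_approx_emst(points, n_boundary):
    """Illustrative two-level approximate EMST (assumptions noted inline)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    if not 1 <= n_boundary <= n:
        raise ValueError("n_boundary must be between 1 and len(points)")

    # Level 1 (ASSUMPTION): use distance to the centroid as a stand-in
    # outlier score and treat the n_boundary highest-scoring points as
    # boundary points; run exact Prim on that reduced set.
    scores = np.linalg.norm(points - points.mean(axis=0), axis=1)
    order = np.argsort(scores)
    boundary = order[-n_boundary:]
    edges = [(int(boundary[a]), int(boundary[b]), w)
             for a, b, w in prim_mst(points[boundary])]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[boundary] = True

    # Level 2 (ASSUMPTION): attach each remaining point, from the outside
    # in, to its nearest neighbor that is already part of the tree.
    for i in order[:-n_boundary][::-1]:
        d = np.linalg.norm(points - points[i], axis=1)
        d[~in_tree] = np.inf          # only tree nodes are candidates
        j = int(np.argmin(d))
        edges.append((j, int(i), float(d[j])))
        in_tree[i] = True
    return edges
```

Because every spanning tree weighs at least as much as the exact minimum spanning tree, comparing the total edge weight of `two_level_approx_emst(points, n_boundary)` with that of `prim_mst(points)` gives a simple measure of approximation quality. The brute-force neighbor search keeps the sketch short but is itself quadratic; a real speedup would need a nearest-neighbor index.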