Combining Semi-supervision and Hubness to Enhance High-dimensional Data Clustering
Author: | de Lima, Mateus C., Barioni, Maria Camila N., Razente, Humberto L. |
---|---|
Contributors: | CNPq, CAPES |
Language: | English |
Publication year: | 2017 |
Subject: | |
Source: | Journal of Information and Data Management; v. 8, n. 3 (2017): JOURNAL OF INFORMATION AND DATA MANAGEMENT; 223 |
ISSN: | 2178-7107 |
Description: | The curse of dimensionality makes high-dimensional data analysis a challenging task for data clustering techniques. Recent works have efficiently employed an aspect inherent to high-dimensional data in proposing clustering approaches guided by hubs, which provide information about the distribution of data instances among the K-nearest neighbors. However, hubs may not reflect the implicit data semantics well, leading to an unsuitable data partition. To cope with both issues (i.e., high-dimensional data and meaningful clusters), this paper presents a clustering approach that explores the combination of two strategies: semi-supervision and density estimation based on hubness scores. Experiments conducted with 23 real datasets show that the proposed approach achieves superior performance when applied to datasets with different characteristics. |
Database: | OpenAIRE |
External link: |
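
The description refers to hubness scores, which count how often a data instance appears among the K-nearest neighbors of other instances. The following is a minimal sketch of that counting step only, assuming Euclidean distance and a brute-force neighbor search. The function name `hubness_scores` and its parameters are hypothetical and chosen for illustration; this is not the authors' implementation and does not cover the semi-supervised or density-estimation parts of their approach.

```python
import numpy as np


def hubness_scores(X, k):
    """Count how often each point appears in the k-NN lists of other points.

    Hypothetical helper for illustration: assumes Euclidean distance and
    computes all pairwise distances in memory (only suitable for small data).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # A point is never counted as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest neighbors of each point.
    knn = np.argsort(dist, axis=1)[:, :k]
    # The score of point j is the number of k-NN lists that contain j.
    return np.bincount(knn.ravel(), minlength=n)


if __name__ == "__main__":
    points = [[0.0], [1.0], [2.5], [10.0]]
    print(hubness_scores(points, k=1))
```

Points with high scores are the hubs; points with a score of zero never occur in any neighbor list. The scores always sum to the number of points times `k`, since every point contributes exactly `k` neighbors.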