Semi-supervised Clustering Framework Based on Active Learning for Real Data

Authors: Odate Ryosuke, Masahiro Motobayashi, Suzuki Yasufumi, Hiroshi Shinjo
Publication year: 2018
Subject:
Source: Lecture Notes in Computer Science ISBN: 9783319977843
S+SSPR
DOI: 10.1007/978-3-319-97785-0_18
Description: This paper proposes a clustering method for real data based on active learning. Clustering methods are difficult to apply to real data for two reasons. First, real data may include outliers that adversely affect clustering. Second, clustering parameters such as the number of clusters cannot be held constant, because the number of classes in real data may increase over time. To address the first problem, we focus on labeling outliers: we develop a stream-based active learning framework for clustering that lets us label outliers intensively. To address the second problem, we develop an algorithm that automatically sets the clustering parameters from some labeled samples. The experimental results show that our method handles both problems better than conventional clustering methods.
Database: OpenAIRE
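
Illustration: the record gives no algorithmic detail, so the following is only a minimal sketch of the general idea of stream-based active learning for clustering, not the authors' method. It assumes a distance threshold decides which arriving samples count as outliers, that outliers are sent to a labeling oracle, and that a previously unseen label opens a new cluster. The names `stream_cluster`, `oracle`, and `threshold` are hypothetical.

```python
import math


def stream_cluster(stream, oracle, threshold):
    """Assign each arriving point to a cluster, querying `oracle` for outliers.

    Hypothetical sketch: `oracle` stands in for a human annotator and
    `threshold` is an assumed, hand-picked distance cutoff.
    """
    centroids = {}    # label -> (running mean vector, sample count)
    assignments = []  # label chosen for each point, in arrival order
    queries = 0       # how many times the oracle was asked

    for x in stream:
        # Find the nearest existing centroid, if any.
        nearest, dist = None, math.inf
        for label, (mean, _) in centroids.items():
            d = math.dist(x, mean)
            if d < dist:
                nearest, dist = label, d

        if dist > threshold:
            # Far from every known cluster: treat as an outlier and ask for a label.
            label = oracle(x)
            queries += 1
        else:
            label = nearest

        # Update the running mean for that label, creating the cluster if it is new.
        mean, n = centroids.get(label, (tuple(0.0 for _ in x), 0))
        mean = tuple((m * n + xi) / (n + 1) for m, xi in zip(mean, x))
        centroids[label] = (mean, n + 1)
        assignments.append(label)

    return assignments, centroids, queries
```

In this sketch the number of clusters is not fixed in advance: it grows whenever the oracle returns a label that has not been seen before, which loosely mirrors the concern that the number of classes may increase over time. The fixed threshold is a simplification; the paper describes setting clustering parameters automatically from labeled samples, which is not reproduced here.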