Abstract: |
Image data stream classification is in high demand in contexts such as public security, medicine, and remote sensing. Despite the research effort in this field, several challenges remain. Among them are the emergence of new classes (concept evolution), changes in already-known concepts (concept drift), and the high dimensionality of the data, which can lead to the curse of dimensionality. This article presents the HubISC algorithm, which uses hubs and forgetting strategies to address these challenges in image data stream classification. Hubness is an inherent property of high-dimensional data. HubISC exploits it for two purposes: first, to summarize the instances of each class, and second, to select instances in the active learning step. The algorithm was evaluated using an evaluation framework specially designed to support the analysis of the diverse challenges inherent to image data stream classification. The results of extensive experiments show that HubISC offers a good trade-off between efficacy and the percentage of labeled instances required when compared to commonly used algorithms for image data stream classification.