Fast k-NN Classifier for Documents Based on a Graph Structure.

Authors: Artigas-Fuentes, Fernando José; Gil-García, Reynaldo; Badía-Contelles, José Manuel; Pons-Porrata, Aurora
Source: Progress in Pattern Recognition, Image Analysis, Computer Vision & Applications (9783642166860); 2010, p228-235, 8p
Abstract: In this paper, a fast k-nearest-neighbors (k-NN) classifier for documents is presented. Documents are usually represented in a high-dimensional feature space, where their terms are treated as features and the weight of each term reflects its importance in the document. There are many approaches to finding the neighborhood of an object, but their performance decreases drastically as the number of dimensions grows, which prevents their application to documents. The proposed method is based on a graph index structure with a fast search algorithm. Its high selectivity yields a classification quality similar to that of the exhaustive classifier while computing only a small number of distances. Our experimental results show that the method can be applied to problems of very high dimensionality, such as text mining. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
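
Note: The abstract compares the proposed method against an exhaustive k-NN classifier over term-weighted document vectors. The following is a minimal Python sketch of that exhaustive baseline only, not of the authors' graph index, whose details the abstract does not give. The raw term-frequency weighting, the cosine similarity, the majority vote, the function names (tf_vector, cosine, knn_classify), and the toy training set are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Sparse term-weight vector: raw term frequency (an assumed weighting)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like objects."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector when computing the dot product
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_classify(query, training, k=3):
    """Exhaustive k-NN: one similarity computation per training document."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(text)), label) for text, label in training]
    # Most similar documents first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
training = [
    ("graph index search neighbors", "retrieval"),
    ("nearest neighbors search index structure", "retrieval"),
    ("image pixels edges segmentation", "vision"),
    ("image segmentation color pixels", "vision"),
]

label = knn_classify("fast search over a graph index", training, k=3)
```

The baseline computes one similarity per training document for every query. According to the abstract, the graph index reduces that cost by computing only a small number of distances.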