A pre-averaged pseudo nearest neighbor classifier

Autor: Dapeng Li
Jazyk: angličtina
Rok vydání: 2024
Předmět:
Zdroj: PeerJ Computer Science, Vol 10, p e2247 (2024)
Druh dokumentu: article
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.2247
Popis: The k-nearest neighbor algorithm is a powerful classification method. However, its classification performance will be affected in small-size samples with existing outliers. To address this issue, a pre-averaged pseudo nearest neighbor classifier (PAPNN) is proposed to improve classification performance. In the PAPNN rule, the pre-averaged categorical vectors are calculated by taking the average of any two points of the training sets in each class. Then, k-pseudo nearest neighbors are chosen from the preprocessed vectors of every class to determine the category of a query point. The pre-averaged vectors can reduce the negative impact of outliers to some degree. Extensive experiments are conducted on nineteen numerical real data sets and three high dimensional real data sets by comparing PAPNN to other twelve classification methods. The experimental results demonstrate that the proposed PAPNN rule is effective for classification tasks in the case of small-size samples with existing outliers.
Databáze: Directory of Open Access Journals