Self-Training using a K-Nearest Neighbor as a Base Classifier Reinforced by Support Vector Machines

Author:	Abdelatif Ennaji, M'bark Iggane, Mostafa El Yassa, Driss Mammass
Year of publication:	2012
Subject:	Training set; Computer science; Supervised learning; Pattern recognition; Semi-supervised learning; Machine learning; k-nearest neighbors algorithm; Support vector machine; ComputingMethodologies_PATTERNRECOGNITION; Labeled data; Artificial intelligence; Self training; Classifier (UML)
Source:	International Journal of Computer Applications. 56:43-46
ISSN:	0975-8887
DOI:	10.5120/8899-2925
Description:	In supervised learning, algorithms infer a general prediction model from previously labeled data. However, in many real-world machine learning problems, labeled data are scarce while unlabeled data are abundant. The reliability of the learned model depends essentially on the size of the training set (the labeled data): if the amount of labeled data is too small, the generalization error of the learned model may be large. In such situations, semi-supervised learning algorithms may improve the generalization performance of the model by integrating unlabeled data into the learning process. One of the most classical semi-supervised learning methods is self-training. An advantage of this method is that several traditional supervised learning algorithms can be used to build the model in the self-training process. In this paper, the k-Nearest Neighbors (k-NN) classifier was chosen to make decisions during the self-training process. We also propose to reinforce the self-training strategy with a support vector machine (SVM) classifier that can help the k-NN to label the unlabeled data. Experimental results showed that self-training based on k-NN and SVM can outperform self-training based on the k-NN classifier alone.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::7f0cd6ffd8c8c356378925f04cd3a78d https://doi.org/10.5120/8899-2925
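
The self-training idea outlined in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how the SVM "helps" the k-NN, so the rule used here (accept a k-NN label only when a linear SVM agrees with it) is an assumption, as are the function names, the hyperparameters, the toy data, and the choice of a linear SVM trained by stochastic subgradient descent on the hinge loss.

```python
import math
import random

def knn_predict(train, x, k=3):
    """Majority vote among the k nearest labeled points (labels are -1 or +1; use odd k)."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    return 1 if sum(label for _, label in nearest) > 0 else -1

def train_linear_svm(train, epochs=200, lr=0.01, lam=0.01, seed=0):
    """Linear SVM: stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(train[0][0]), 0.0
    for _ in range(epochs):
        for x, y in rng.sample(train, len(train)):  # shuffled pass over the data
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # point violates the margin: hinge term is active
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:           # only the regularizer contributes
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def svm_predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

def self_train(labeled, unlabeled, k=3, max_rounds=10):
    """Self-training with k-NN as base classifier.

    ASSUMPTION (not from the paper): a pseudo-label is accepted only when
    the SVM trained on the current labeled set agrees with the k-NN vote.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train_linear_svm(labeled)
        accepted, rejected = [], []
        for x in pool:
            y_knn = knn_predict(labeled, x, k)
            if y_knn == svm_predict(model, x):
                accepted.append((x, y_knn))
            else:
                rejected.append(x)
        if not accepted:  # nothing new was labeled: stop early
            break
        labeled.extend(accepted)
        pool = rejected
    return labeled, pool

# Hypothetical toy data: two clusters with a few labeled points each.
labeled = [((-2.0, -2.0), -1), ((-3.0, -1.0), -1), ((-1.0, -3.0), -1),
           ((2.0, 2.0), 1), ((3.0, 1.0), 1), ((1.0, 3.0), 1)]
unlabeled = [(-2.5, -2.5), (2.5, 2.5), (-1.5, -2.0), (1.5, 2.0), (0.1, -0.1)]

final_labeled, still_unlabeled = self_train(labeled, unlabeled)
print(len(final_labeled), "labeled,", len(still_unlabeled), "left unlabeled")
```

Points on which the two classifiers disagree stay in the unlabeled pool and are reconsidered in the next round, once the labeled set has grown. A fuller variant would typically add only the most confident pseudo-labels per round rather than every agreed one.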