Self-Training using a K-Nearest Neighbor as a Base Classifier Reinforced by Support Vector Machines
Author: | Abdelatif Ennaji, M'bark Iggane, Mostafa El Yassa, Driss Mammass |
---|---|
Year of publication: | 2012 |
Subject: |
Training set, Computer science, Supervised learning, Pattern recognition, Semi-supervised learning, Machine learning, k-nearest neighbors algorithm, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Labeled data, Artificial intelligence, Self training, Classifier (UML) |
Source: | International Journal of Computer Applications. 56:43-46 |
ISSN: | 0975-8887 |
DOI: | 10.5120/8899-2925 |
Description: | In supervised learning, algorithms infer a general prediction model from previously labeled data. However, in many real-world machine learning problems, labeled data are scarce while unlabeled data are abundant. The reliability of the learned model depends essentially on the size of the training set (the labeled data): if the amount of labeled data is not large enough, the generalization error of the learned model may be large. In such situations, a semi-supervised learning algorithm may improve the generalization performance of the model by integrating unlabeled data into the learning process. One of the most classical semi-supervised learning methods is self-training. An advantage of this method is that several traditional supervised learning algorithms can be used to build the model in the self-training process. In this paper, the k-Nearest Neighbors (k-NN) classifier is chosen to make the decisions during the self-training process. We also propose to reinforce the self-training strategy with a Support Vector Machine (SVM) classifier that helps the k-NN label the unlabeled data. Experimental results show that self-training based on k-NN and SVM can outperform self-training based on the k-NN classifier alone. |
Database: | OpenAIRE |
External link: |
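
The self-training idea summarized in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the record does not say how the SVM "helps" the k-NN, so the sketch assumes an unlabeled point is pseudo-labeled only when both classifiers agree. The function names, the two-class -1/+1 labels, the Pegasos-style linear SVM, and the toy data are all hypothetical choices made for the example.

```python
import random

def knn_predict(X, y, query, k=3):
    """Majority vote (labels are -1/+1) among the k nearest labeled points."""
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))
    nearest = sorted(range(len(X)), key=lambda i: sq_dist(X[i]))[:k]
    return 1 if sum(y[i] for i in nearest) > 0 else -1

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Linear SVM fitted by Pegasos-style stochastic subgradient descent.
    The bias is folded in as a constant feature, so it is regularized too
    (a simplification chosen to keep the sketch short)."""
    rng = random.Random(seed)
    data = [(list(x) + [1.0], label) for x, label in zip(X, y)]
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, label in data:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            margin = label * sum(wj * xj for wj, xj in zip(w, x))
            w = [(1.0 - 1.0 / t) * wj for wj in w]     # shrink (regularizer)
            if margin < 1:                             # hinge loss is active
                w = [wj + eta * label * xj for wj, xj in zip(w, x)]
    return w

def svm_predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score > 0 else -1

def self_train(X_lab, y_lab, X_unl, k=3, max_rounds=10):
    """Self-training with k-NN as base classifier. ASSUMPTION: a point is
    pseudo-labeled only when the k-NN and the SVM predict the same class."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        if not pool:
            break
        w = train_linear_svm(X_lab, y_lab)
        accepted, remaining = [], []
        for x in pool:
            knn_label = knn_predict(X_lab, y_lab, x, k)
            if knn_label == svm_predict(w, x):
                accepted.append((x, knn_label))
            else:
                remaining.append(x)
        if not accepted:          # no agreement this round: stop early
            break
        for x, label in accepted:
            X_lab.append(x)
            y_lab.append(label)
        pool = remaining
    return X_lab, y_lab, pool

# Hypothetical toy data: two well-separated clusters.
X_lab = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
y_lab = [1, 1, 1, -1, -1, -1]
X_unl = [(2.5, 2.5), (4, 3), (-2.5, -2.5), (-4, -3)]

X_new, y_new, leftover = self_train(X_lab, y_lab, X_unl)
print(len(X_new), y_new[len(y_lab):], leftover)
```

Requiring agreement is a conservative choice: points on which the two classifiers disagree stay in the unlabeled pool and are reconsidered in later rounds, after the labeled set has grown. Other ways of combining the classifiers, such as confidence thresholds, would fit the description equally well.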