Applying Efficient Selection Techniques of Unlabeled Instances for Wrapper-Based Semi-Supervised Methods

Authors:	Cephas A. S. Barreto, Arthur Costa Gorgonio, Joao C. Xavier-Junior, Anne Magaly De Paula Canuto
Language:	English
Year of publication:	2022
Subject:	Artificial intelligence; machine learning; semi-supervised learning; self-training semi-supervised method; co-training semi-supervised method; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Source:	IEEE Access, Vol 10, Pp 43535-43551 (2022)
Document type:	article
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3169498
Description:	Semi-supervised learning (SSL) is a machine learning approach that integrates supervised and unsupervised learning mechanisms. This integration may be done in different ways and one possibility is to use a wrapper-based strategy. The main aim of a wrapper-based strategy is to use a small number of labelled instances to create a learning model. Then, this created model is used in a labelling process, where some unlabelled instances are labelled, and consequently, these instances are incorporated into the labelled set. One important aspect of a wrapper-based SSL method is the selection of unlabelled instances to be labelled in the labelling process. In other words, an efficient selection process plays an important role in the design of a wrapper-based SSL method since it can lead to an efficient labelling process, and in turn, the creation of efficient learning models. In this paper, we propose the use of three selection methods that can be applied to wrapper-based SSL methods. The main idea is to use two different selection criteria, prediction confidence or classification agreement with a distance metric, to perform an efficient selection of the unlabelled instances. In order to assess the feasibility of the proposed approach, the selection methods are applied in two well-known wrapper-based SSL methods, which are: Self-training and Co-training. Additionally, an empirical analysis will be conducted in which we compare the standard Self-training and Co-training methods against the proposed versions of these two SSL methods over 35 classification datasets.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/2b5bdc283e6144f5b0465c9fcd0b852f
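
The wrapper-based labelling loop summarised in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not reproduce any of the three selection methods proposed in the paper: the nearest-centroid base classifier, the softmax-style confidence score, the function names, and the parameters `per_round`, `min_conf`, and `max_rounds` are all assumptions made for illustration. The sketch only shows the general shape of Self-training with a selection step that combines prediction confidence with a distance metric.

```python
import math

def centroids(X, y):
    """Mean point of each class in the labelled set (hypothetical base model)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict_with_confidence(cents, x):
    """Return (label, confidence, distance to the predicted class centroid)."""
    dists = {c: math.dist(x, m) for c, m in cents.items()}
    # Assumed confidence score: softmax over negative distances.
    weights = {c: math.exp(-d) for c, d in dists.items()}
    label = max(weights, key=weights.get)
    conf = weights[label] / sum(weights.values())
    return label, conf, dists[label]

def self_train(X_lab, y_lab, X_unl, per_round=2, min_conf=0.9, max_rounds=10):
    """Self-training loop: train, select unlabelled instances, label, repeat."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        if not pool:
            break
        cents = centroids(X_lab, y_lab)  # (re)train on the current labelled set
        candidates = []
        for idx, x in enumerate(pool):
            label, conf, dist = predict_with_confidence(cents, x)
            if conf >= min_conf:  # criterion 1: prediction confidence
                candidates.append((dist, -conf, idx, label))
        if not candidates:
            break  # nothing is confident enough, so stop labelling
        candidates.sort()  # criterion 2: closest to the class centroid first
        chosen = candidates[:per_round]
        for _, _, idx, label in chosen:
            X_lab.append(pool[idx])
            y_lab.append(label)
        taken = {idx for _, _, idx, _ in chosen}
        pool = [x for i, x in enumerate(pool) if i not in taken]
    return X_lab, y_lab, pool

# Toy usage: two labelled points, five unlabelled ones, one of them ambiguous.
X_out, y_out, remaining = self_train(
    [(0, 0), (10, 10)], [0, 1],
    [(1, 0), (0, 1), (9, 10), (10, 9), (5, 5)],
)
```

In this toy run the point `(5, 5)` lies midway between the two classes, so it never passes the confidence threshold and stays in the unlabelled pool, while the four unambiguous points are moved into the labelled set over successive rounds.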