Semi-supervised genetic programming for classification

Authors: Filipe de Lima Arcanjo, Paulo Viana Bicalho, Gisele L. Pappa, Altigran Soares da Silva, Wagner Meira
Year of publication: 2011
Subject:
Source: GECCO
DOI: 10.1145/2001576.2001746
Description: Learning from unlabeled data offers substantial advantages in the many applications where large amounts of unlabeled data are freely available. Semi-supervised learning, which builds models from a small set of labeled examples and a potentially large set of unlabeled examples, is a paradigm that can use such data effectively. Here we propose KGP, a semi-supervised transductive genetic programming algorithm for classification. Apart from being one of the first semi-supervised algorithms, it is transductive (rather than inductive), i.e., it requires only a training dataset with labeled and unlabeled examples, which should represent the complete data domain. The algorithm relies on the three main assumptions on which semi-supervised algorithms are built, and performs both global search on labeled instances and local search on unlabeled instances. Periodically, unlabeled examples are moved to the labeled set after a weighted voting process performed by a committee. Results on eight UCI datasets were compared with Self-Training and KNN, and showed KGP to be a promising method for semi-supervised learning.
Database: OpenAIRE
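
Illustration: the description mentions that unlabeled examples are periodically moved to the labeled set after a weighted committee vote. The Python sketch below shows one way such a step could look. It is not the KGP implementation: the function names weighted_vote and promote, the use of plain callables as committee members, the per-member weights, and the 0.8 agreement threshold are all assumptions made for this example, since the record does not specify how the committee is formed or how votes are weighted.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return (label, share): the label with the most total weight and
    the fraction of all weight that voted for it."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


def promote(unlabeled, committee, weights, threshold=0.8):
    """Split unlabeled examples into (example, label) pairs the committee
    agrees on strongly enough and examples that stay unlabeled.
    The threshold is an assumed parameter, not taken from the paper."""
    newly_labeled, remaining = [], []
    for x in unlabeled:
        label, share = weighted_vote([member(x) for member in committee], weights)
        if share >= threshold:
            newly_labeled.append((x, label))
        else:
            remaining.append(x)
    return newly_labeled, remaining


# Toy committee (hypothetical): three threshold rules over one numeric feature.
committee = [lambda x: int(x > 0.5), lambda x: int(x > 0.4), lambda x: int(x > 0.7)]
weights = [0.9, 0.6, 0.5]  # assumed to reflect each member's quality

labeled, still_unlabeled = promote([0.1, 0.6, 0.95], committee, weights)
print(labeled)          # examples moved to the labeled set
print(still_unlabeled)  # examples the committee did not agree on
```

In this toy run the committee is unanimous on 0.1 and 0.95, so both are promoted, while 0.6 receives only 75% of the vote weight and stays unlabeled. In an iterative procedure, a step like this would be repeated as the labeled set grows.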