Semi-Supervised Text Classification With Universum Learning

Authors:	Chia-Hoang Lee, Chien-Liang Liu, Wen-Hoar Hsaio, Tsung-Hsun Kuo, Tao-Hsing Chang
Publication year:	2016
Subject:	Normalization (statistics); industrial biotechnology; Boosting (machine learning); Computer science; engineering and technology; Machine learning; industrial engineering & automation; Classifier (linguistics); Prior probability; electrical engineering, electronic engineering, information engineering; AdaBoost; Electrical and Electronic Engineering; Cluster analysis; Pattern recognition; Computer Science Applications; Human-Computer Interaction; Support vector machine; Control and Systems Engineering; artificial intelligence & image processing; Algorithm design; Artificial intelligence; Classifier (UML); Software; Information Systems
Source:	IEEE Transactions on Cybernetics. 46:462-473
ISSN:	2168-2275 2168-2267
DOI:	10.1109/tcyb.2015.2403573
Description:	Universum, a collection of nonexamples that do not belong to any class of interest, has become a new research topic in machine learning. This paper devises a semi-supervised learning algorithm with Universum based on the boosting technique, focusing on situations where only a few labeled examples are available. We also show that the training error of AdaBoost with Universum is bounded by the product of the normalization factors, and that the training error drops exponentially fast when each weak classifier is slightly better than random guessing. Finally, the experiments use four data sets in several combinations. The results indicate that the proposed algorithm can benefit from Universum examples and outperform several alternative methods, particularly when labeled examples are scarce. When the number of labeled examples is insufficient to estimate the parameters of the classification functions, the Universum can be used to approximate the prior distribution of the classification functions. The experimental results can be explained using the concept of Universum introduced by Vapnik: Universum examples implicitly specify a prior distribution on the set of classification functions.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::33ffd1fbb09ed459c9a04ce4a689360b https://doi.org/10.1109/tcyb.2015.2403573
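
The description states that the training error of AdaBoost with Universum is bounded by the product of the normalization factors. The record does not give the paper's algorithm, so the sketch below illustrates only the analogous, well-known bound for standard AdaBoost, without any Universum examples. The toy one-dimensional data set, the decision-stump weak learner, and all function names (`stump_predict`, `best_stump`, `predict`, `adaboost`) are assumptions made for illustration and do not come from the paper.

```python
import math


def stump_predict(x, threshold, polarity):
    """Decision stump: returns polarity if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump."""
    points = sorted(set(xs))
    # Candidate thresholds: one below all points, then midpoints between neighbors.
    thresholds = [points[0] - 1.0] + [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for thr in thresholds:
        for pol in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, thr, pol) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best


def predict(ensemble, x):
    """Sign of the weighted vote of all stumps (ties go to -1)."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score > 0 else -1


def adaboost(xs, ys, rounds):
    """Standard AdaBoost; records (training_error, product_of_Z) per round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble, history = [], []
    z_product = 1.0
    for _ in range(rounds):
        err, thr, pol = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        unnormalized = [w * math.exp(-alpha * y * stump_predict(x, thr, pol))
                        for x, y, w in zip(xs, ys, weights)]
        z = sum(unnormalized)  # normalization factor Z_t for this round
        weights = [u / z for u in unnormalized]
        z_product *= z
        ensemble.append((alpha, thr, pol))
        train_err = sum(1 for x, y in zip(xs, ys) if predict(ensemble, x) != y) / n
        history.append((train_err, z_product))
    return ensemble, history


# Toy data (made up): no single stump can separate these labels.
XS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble, history = adaboost(XS, YS, rounds=10)
for t, (train_err, bound) in enumerate(history, start=1):
    print(f"round {t}: training error = {train_err:.3f}, product of Z = {bound:.3f}")
```

In each round the printed training error should not exceed the running product of the normalization factors. Each factor equals 2·sqrt(e(1−e)) for weighted error e, so the product shrinks geometrically whenever every stump is even slightly better than random guessing.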