Classifying Documents within Multiple Hierarchical Datasets using Multi-Task Learning
Authors: | Huzefa Rangwala, Azad Naik, Anveshi Charuvaka |
---|---|
Year of publication: | 2017 |
Subject: |
FOS: Computer and information sciences
Computer science; Active learning (machine learning); Competitive learning; Stability (learning theory); Multi-task learning; Machine Learning (stat.ML); Semi-supervised learning; Machine learning; Machine Learning (cs.LG); Multiclass classification; Inductive transfer; Statistics - Machine Learning; Information systems; Electrical engineering, electronic engineering, information engineering; Instance-based learning; Learning classifier system; Preference learning; Algorithmic learning theory; Supervised learning; Online machine learning; Linear discriminant analysis; Generalization error; Computer Science - Learning; Binary classification; Unsupervised learning; Artificial intelligence & image processing; Transfer of learning |
Source: | ICTAI |
DOI: | 10.48550/arxiv.1706.01583 |
Description: | Multi-task learning (MTL) is a supervised learning paradigm in which the prediction models for several related tasks are learned jointly to achieve better generalization performance. When there are only a few training examples per task, MTL considerably outperforms traditional single-task learning (STL) in terms of prediction accuracy. In this work we develop an MTL-based approach for classifying documents that are archived within dual concept hierarchies, namely DMOZ and Wikipedia. We solve the multi-class classification problem by defining one-versus-rest binary classification tasks for each of the different classes across the two hierarchical datasets. Instead of learning a linear discriminant for each task independently, we use an MTL approach in which relationships between the different tasks across the datasets are established using a non-parametric, lazy, nearest-neighbor approach. We also develop and evaluate a transfer learning (TL) approach and compare the MTL (and TL) methods against the standard single-task learning and semi-supervised learning approaches. Our empirical results demonstrate the strength of the developed methods, which show an improvement especially when there are fewer training examples per classification task. Comment: IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2013 |
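
The description outlines the general setup: each class in each hierarchy yields a one-versus-rest binary task, related tasks are identified with a nearest-neighbor search, and linear discriminants for related tasks are learned jointly. The following is a minimal sketch of that setup, not the authors' code; the coupling penalty, the cosine nearest-neighbor choice, and names such as `find_related_tasks` and `train_mtl` are illustrative assumptions.

```python
# Hedged sketch of one-vs-rest multi-task learning with nearest-neighbor task
# relationships (an assumption-laden illustration, not the paper's implementation).
import numpy as np

def find_related_tasks(centroids, k=1):
    """Return, for each task, the indices of the k nearest other task centroids
    by cosine similarity (stand-in for the lazy nearest-neighbor step)."""
    normed = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)          # a task is not related to itself
    return np.argsort(-sims, axis=1)[:, :k]

def train_mtl(tasks, related, lam=1e-2, mu=1e-1, lr=0.1, epochs=200):
    """Jointly learn one linear (logistic) discriminant per binary task.

    tasks   : list of (X, y) pairs with y in {0, 1} (one-vs-rest labels)
    related : per-task indices of related tasks (from find_related_tasks)
    Per-task objective (assumed form): logistic loss
        + lam * ||w_t||^2 + mu * sum over related r of ||w_t - w_r||^2
    """
    d = tasks[0][0].shape[1]
    W = np.zeros((len(tasks), d))
    for _ in range(epochs):
        for t, (X, y) in enumerate(tasks):
            p = 1.0 / (1.0 + np.exp(-X @ W[t]))          # sigmoid predictions
            grad = X.T @ (p - y) / len(y) + 2 * lam * W[t]
            for r in related[t]:                         # couple related weights
                grad += 2 * mu * (W[t] - W[r])
            W[t] -= lr * grad
    return W
```

In this sketch, task centroids would be computed as the mean feature vector of each class's positive documents; setting `mu = 0` recovers independent single-task training, which is the STL baseline the description compares against.
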
Database: | OpenAIRE |
External link: |