Classifying Documents within Multiple Hierarchical Datasets using Multi-Task Learning

Authors: Huzefa Rangwala, Azad Naik, Anveshi Charuvaka
Publication year: 2017
Subject:
FOS: Computer and information sciences
Computer science
Active learning (machine learning)
Competitive learning
Stability (learning theory)
Multi-task learning
Machine Learning (stat.ML)
02 engineering and technology
Semi-supervised learning
Machine learning
Machine Learning (cs.LG)
Multiclass classification
Inductive transfer
Statistics - Machine Learning
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Instance-based learning
Learning classifier system
Preference learning
Algorithmic learning theory
Supervised learning
Online machine learning
Linear discriminant analysis
Generalization error
Computer Science - Learning
Binary classification
Unsupervised learning
020201 artificial intelligence & image processing
Artificial intelligence
Transfer of learning
Source: ICTAI
DOI: 10.48550/arxiv.1706.01583
Description: Multi-task learning (MTL) is a supervised learning paradigm in which the prediction models for several related tasks are learned jointly to achieve better generalization performance. When there are only a few training examples per task, MTL considerably outperforms traditional single-task learning (STL) in terms of prediction accuracy. In this work we develop an MTL-based approach for classifying documents that are archived within two concept hierarchies, namely DMOZ and Wikipedia. We solve the multi-class classification problem by defining a one-versus-rest binary classification task for each class across the two hierarchical datasets. Instead of learning a linear discriminant for each task independently, we use an MTL approach in which relationships between tasks across the datasets are established using a non-parametric, lazy nearest-neighbor approach. We also develop and evaluate a transfer learning (TL) approach and compare the MTL and TL methods against standard single-task learning and semi-supervised learning approaches. Our empirical results demonstrate the strength of the developed methods, which show an improvement especially when there are few training examples per classification task.
Comment: IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2013
Database: OpenAIRE
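
Illustration: the description above outlines one-versus-rest tasks, nearest-neighbor task relationships across two datasets, and joint learning of linear discriminants. The following is a minimal sketch of that general idea, not the authors' implementation. The function names, the use of class centroids with cosine similarity to find a neighboring task, the logistic loss, and the pairwise penalty on weight differences (with hypothetical parameters `lam` and `mu`) are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def one_vs_rest_labels(y, classes):
    """Build one +1/-1 label vector per class; each vector defines one binary task."""
    return {c: np.where(y == c, 1.0, -1.0) for c in classes}


def nearest_task(centroids_a, centroids_b):
    """For each class centroid in dataset A, return the index of the most
    cosine-similar class centroid in dataset B (assumed relationship rule)."""
    a = centroids_a / np.linalg.norm(centroids_a, axis=1, keepdims=True)
    b = centroids_b / np.linalg.norm(centroids_b, axis=1, keepdims=True)
    return np.argmax(a @ b.T, axis=1)


def train_mtl(tasks, pairs, lam=0.1, mu=1.0, lr=0.1, steps=200):
    """Jointly fit one linear discriminant per task by gradient descent.

    tasks: list of (X, y) with X of shape (n, d) and y in {+1, -1}
    pairs: list of (i, j) task indices treated as related
    Objective (an assumption): mean logistic loss per task
      + lam/2 * ||w_t||^2 + mu/2 * ||w_i - w_j||^2 for each related pair.
    """
    d = tasks[0][0].shape[1]
    W = np.zeros((len(tasks), d))
    for _ in range(steps):
        G = lam * W  # gradient of the L2 penalty
        for t, (X, y) in enumerate(tasks):
            margins = y * (X @ W[t])
            # gradient of the mean of log(1 + exp(-margin))
            G[t] -= (X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        for i, j in pairs:
            diff = W[i] - W[j]
            G[i] += mu * diff  # pull related tasks toward each other
            G[j] -= mu * diff
        W -= lr * G
    return W
```

Setting `mu=0` or passing an empty `pairs` list reduces this sketch to independent single-task learning, which gives a simple baseline for comparing the effect of the coupling when each task has only a few training examples.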