A novel semi supervised approach for text classification
Author: | Nirmalya Chowdhury, Debaditya Barman |
---|---|
Year of publication: | 2018 |
Subject: |
Computer Networks and Communications; Computer science; business.industry; Applied Mathematics; Decision tree; 020206 networking & telecommunications; Pattern recognition; Kohonen self organizing map; Text document; 02 engineering and technology; Class (biology); Computer Science Applications; Support vector machine; Naive Bayes classifier; ComputingMethodologies_PATTERNRECOGNITION; Text categorization; Computational Theory and Mathematics; Artificial Intelligence; Classifier (linguistics); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; Electrical and Electronic Engineering; business; Information Systems |
Source: | International Journal of Information Technology. 12:1147-1157 |
ISSN: | 2511-2112 2511-2104 |
Description: | Text categorization, also known as text classification, is a supervised classification problem: it aims to assign a predefined class label to a new or unknown text document. Training the classifier usually requires a large collection of labelled data from each class, but labelled text data is hard or expensive to collect. In most cases the labels are assigned manually, which is neither cost-effective nor efficient. In this paper, we introduce a semi-supervised classification approach in which the learner needs only a very small amount of labelled data, together with a large amount of unlabelled data, to assign a class label to a new or unknown text document. The proposed method uses a Kohonen self-organizing map (SOM) to label the unlabelled data, and three classifiers, namely support vector machine (SVM), Naive Bayes (NB), and a decision tree (DT) of the classification and regression tree (CART) type, to measure classification accuracy. The experimental results show the effectiveness of the proposed method. |
Database: | OpenAIRE |
External link: |
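
The abstract describes using a Kohonen self-organizing map to label unlabelled documents from a small labelled seed set, but this record does not say how the labels are transferred. The following is a minimal Python sketch of one plausible scheme, not the authors' implementation. It assumes documents are already numeric feature vectors (for example TF-IDF weights), that each map node takes the majority label of the seed documents it wins, and that nodes with no seed document borrow the label of the nearest labelled node on the grid. The function names `train_som`, `best_matching_unit`, and `pseudo_label`, and all parameter defaults, are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weight vector is closest to x."""
    return min(weights, key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM on a list of equal-length feature vectors."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: initialise node weights from evenly spaced training samples.
    weights = {n: list(data[k * len(data) // len(nodes)]) for k, n in enumerate(nodes)}
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                      # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)      # shrinking neighbourhood radius
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian neighbourhood weight
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def pseudo_label(weights, labelled, unlabelled):
    """Label each unlabelled vector via the majority seed label of its winning node.

    `labelled` is a non-empty list of (vector, label) pairs.
    """
    votes = defaultdict(Counter)
    for x, y in labelled:
        votes[best_matching_unit(weights, x)][y] += 1
    node_label = {n: c.most_common(1)[0][0] for n, c in votes.items()}
    result = []
    for x in unlabelled:
        bmu = best_matching_unit(weights, x)
        if bmu not in node_label:
            # Assumption: fall back to the nearest labelled node on the grid.
            bmu = min(node_label, key=lambda n: (n[0] - bmu[0]) ** 2 + (n[1] - bmu[1]) ** 2)
        result.append(node_label[bmu])
    return result
```

Under these assumptions, the pseudo-labelled vectors could then be used to train the three classifiers named in the abstract, for instance with scikit-learn's `SVC`, `MultinomialNB`, and `DecisionTreeClassifier`, and accuracy compared on a held-out labelled set.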