An in-depth exploration of Bangla blog post classification
Authors: | Md. Ismail Jabiullah, Md. Tarek Habib, Md. Mehedee Zaman Khan, Ashik Iqbal Prince, Tanvirul Islam |
---|---|
Year of publication: | 2021 |
Subject: | Control and Optimization; Computer Networks and Communications; Computer science; Bigram; Decision tree; 02 engineering and technology; Machine learning; computer.software_genre; 01 natural sciences; Unigram; 010305 fluids & plasmas; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Computer Science (miscellaneous); Supervised machine learning; Electrical and Electronic Engineering; tf–idf; Instrumentation; business.industry; Supervised learning; TF-IDF; Trigram; Bangla text classification; Perceptron; Random forest; Support vector machine; ComputingMethodologies_PATTERNRECOGNITION; Hardware and Architecture; Control and Systems Engineering; Bangla blog; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Information Systems |
Source: | Bulletin of Electrical Engineering and Informatics. 10:742-749 |
ISSN: | 2302-9285 2089-3191 |
Description: | Bangla blogs are growing rapidly in the information era, and as a result they vary widely in layout and categorization. In this setting, automated blog post classification is a comparatively efficient way to organize Bangla blog posts in a standard manner so that users can easily find articles of interest. In this research, nine supervised learning models are applied and compared for classifying Bangla blog posts: support vector machine (SVM), multinomial naïve Bayes (MNB), multi-layer perceptron (MLP), k-nearest neighbours (k-NN), stochastic gradient descent (SGD), decision tree, perceptron, ridge classifier, and random forest. To evaluate performance in predicting blog posts across eight categories, three feature extraction techniques are applied: unigram TF-IDF (term frequency-inverse document frequency), bigram TF-IDF, and trigram TF-IDF. The majority of the classifiers achieve accuracy above 80%. Other performance evaluation metrics also show good results when the selected classifiers are compared. |
Database: | OpenAIRE |
External link: |
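
The three feature extraction techniques named in the description differ only in the n-gram length over which TF-IDF is computed. The following is a minimal sketch of that idea, not the authors' implementation: the helper names `ngrams` and `tfidf`, the whitespace tokenization, the weighting variant (relative term frequency times the natural log of N/df), and the toy English documents are all assumptions made for illustration, since the record does not state these details.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Compute one TF-IDF weight dictionary per document over n-grams.

    Assumed variant (hypothetical, for illustration only):
      tf  = n-gram count / total n-grams in the document
      idf = ln(number of documents / documents containing the n-gram)
    """
    # Count n-grams per document; documents are split on whitespace.
    grams_per_doc = [Counter(ngrams(doc.split(), n)) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(grams.keys())

    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        total = sum(grams.values())
        vectors.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in grams.items()}
        )
    return vectors


# Toy stand-ins for blog posts (invented, not data from the paper).
docs = [
    "cricket match today",
    "cricket team wins match",
    "new phone review today",
]

unigram_features = tfidf(docs, 1)  # unigram TF-IDF
bigram_features = tfidf(docs, 2)   # bigram TF-IDF
trigram_features = tfidf(docs, 3)  # trigram TF-IDF
```

Each resulting dictionary could then be turned into a feature vector and passed to any of the nine classifiers listed in the description. With longer n-grams, fewer features are shared between documents, which is one reason a comparison across the three settings is informative.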