Performance Analysis of Most Prominent Machine Learning and Deep Learning Algorithms In Classifying Bangla Crime News Articles

Author:	Ariful Islam, Mun Yea Mahafi Taz Zahara, Salma Tabashum, Fahmida Naznin Fami, Md. Mamun Hossain
Publication year:	2020
Subject:	Crime news; Computer science; Deep learning; Machine learning; Support vector machine; Bengali; Bag-of-words model; Word2vec; Artificial intelligence; tf–idf; Classifier (UML); Algorithm
Source:	2020 IEEE Region 10 Symposium (TENSYMP).
DOI:	10.1109/tensymp50017.2020.9230785
Description:	This work addresses Bangla crime type classification, a task for which very little prior work exists. To carry out this research, we first developed a Bangla crime dataset containing around 24,295 news articles and made most of them publicly available on GitHub. We then built a crime classifier model and trained it on this dataset. We evaluated word-vector representations such as bag of words and TF-IDF with state-of-the-art machine learning algorithms, as well as the most promising semantic and syntactic word embeddings, such as Word2Vec, GloVe, and fastText, in both shallow and deep CNN and RNN models, in order to select the best word embeddings for our classifier module. Finally, we summarize the experimental results in tabular form. The results show that deep learning algorithms achieve significantly higher accuracy than state-of-the-art machine learning algorithms in classifying Bangla crime data. In the final experiments, the proposed model, a shallow CNN with fastText, achieves 93.70% accuracy.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::ae106feb625693292b2e9343cc89f9ca https://doi.org/10.1109/tensymp50017.2020.9230785