Comprehension of polarity of articles by citation sentiment analysis using TF-IDF and ML classifiers

Authors: Musarat Karim, Malik Muhammad Saad Missen, Muhammad Umer, Alisha Fida, Ala’ Abdulmajid Eshmawi, Abdullah Mohamed, Imran Ashraf
Language: English
Year of publication: 2022
Subject:
Source: PeerJ Computer Science, Vol 8, p e1107 (2022)
Document type: article
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.1107
Description: Sentiment analysis has been researched extensively during the last few years; however, the sentiment analysis of citations in research articles is an unexplored research area. Sentiment analysis of citations can enable new applications in bibliometrics and provide insights for a better understanding of scientific knowledge. Citation count, as it is used today to measure the quality of a paper, does not portray the quality of a scientific article, because an article may be cited to point out its weaknesses. Determining the polarity of a citation is therefore an important task for quantifying the quality of the cited article and ascertaining its impact and ranking. This article presents an approach to determine the polarity of the cited article using term frequency-inverse document frequency (TF-IDF) and machine learning classifiers. To analyze the influence of an imbalanced dataset, several experiments are performed with and without the synthetic minority oversampling technique (SMOTE), using uni-gram and bi-gram TF-IDF features. Results indicate that the proposed methodology achieves an accuracy of 99.0% with the extra tree classifier when trained on a SMOTE-oversampled dataset with bi-gram features.
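Illustrative note: the two feature-side steps named in the description, uni-gram/bi-gram TF-IDF weighting and SMOTE-style oversampling of the minority class, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the citation sentences and labels are invented for illustration, the weighting tf × (ln(N/df) + 1) is one common TF-IDF variant chosen here, `ngrams`, `tfidf`, and `smote_like` are hypothetical helper names, and the classifier step (the extra tree classifier in the article) is omitted.

```python
import math
import random
from collections import Counter

def ngrams(text, n_max=2):
    """Return lowercase word n-grams of sizes 1..n_max for one sentence."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=2):
    """Return (vocabulary, dense rows) weighted as tf * (ln(N / df) + 1)."""
    counts = [Counter(ngrams(d, n_max)) for d in docs]
    vocab = sorted(set().union(*counts))
    df = Counter(term for c in counts for term in c)  # document frequency
    n_docs = len(docs)
    rows = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against empty sentences
        rows.append([(c[t] / total) * (math.log(n_docs / df[t]) + 1.0)
                     for t in vocab])
    return vocab, rows

def smote_like(minority, n_new, k=1, seed=0):
    """Create n_new synthetic rows, each interpolated between a minority row
    and one of its k nearest minority neighbours (the core idea of SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not base),
                            key=lambda m: math.dist(base, m))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Invented citation sentences; "negative" is the minority class here.
sentences = [
    "we use the dataset of jones",
    "the approach of smith fails on noisy data",
    "results are compared with the baseline of lee",
    "their method fails to scale",
    "we follow the protocol of kim",
    "this corpus was introduced by park",
]
labels = ["neutral", "negative", "neutral", "negative", "neutral", "neutral"]

vocab, rows = tfidf(sentences, n_max=2)
minority = [row for row, lab in zip(rows, labels) if lab == "negative"]
synthetic = smote_like(minority, n_new=2, k=1)

# After oversampling, both classes have four rows each.
balanced_rows = rows + synthetic
balanced_labels = labels + ["negative"] * len(synthetic)
```

In practice, oversampling is normally applied to the training split only, so that synthetic rows do not leak into the evaluation data.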
Database: Directory of Open Access Journals