Confirm or refute?: A comparative study on citation sentiment classification in clinical research publications
Authors: | Zeshan Peng, Shabnam Tafreshi, Graciela Rosemblat, Halil Kilicoglu, Tung Tran, Jodi Schneider |
---|---|
Publication year: | 2019 |
Subject: |
Biomedical Research; Computer science; Health Informatics; Bibliometrics; Machine learning; Convolutional neural network; Citation analysis; Publishing; Artificial neural network; Sentiment analysis; Computer Science Applications; Support vector machine; Categorization; Artificial intelligence; Citation; Algorithms; 03 medical and health sciences; 0302 clinical medicine; 030212 general & internal medicine; 0303 health sciences; 030304 developmental biology |
Source: | J Biomed Inform |
ISSN: | 1532-0464 |
DOI: | 10.1016/j.jbi.2019.103123 |
Description: | Quantifying scientific impact of researchers and journals relies largely on citation counts, despite the acknowledged limitations of this approach. The need for more suitable alternatives has prompted research into developing advanced metrics, such as h-index and Relative Citation Ratio (RCR), as well as better citation categorization schemes to capture the various functions that citations serve in a publication. One such scheme involves citation sentiment: whether a reference paper is cited positively (agreement with the findings of the reference paper), negatively (disagreement), or neutrally. The ability to classify citation function in this manner can be viewed as a first step toward more fine-grained bibliometrics. In this study, we compared several approaches, varying in complexity, for classification of citation sentiment in clinical trial publications. Using a corpus of 285 discussion sections from as many publications (a total of 4,182 citations), we developed a rule-based method as well as supervised machine learning models based on support vector machines (SVM) and two variants of deep neural networks, namely convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM). A CNN model augmented with hand-crafted features yielded the best performance (0.882 accuracy and 0.721 macro-F1 on the held-out set). Our results show that baseline performances of traditional supervised learning algorithms and deep neural network architectures are similar and that hand-crafted features based on sentiment dictionaries and rhetorical structure allow neural network approaches to outperform traditional machine learning approaches for this task. We make the rule-based method and the best-performing neural network model publicly available at: https://github.com/kilicogluh/clinical-citation-sentiment |
Database: | OpenAIRE |
External link: |
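
The description reports both accuracy and macro-F1 for the three citation sentiment classes. The following is a minimal sketch of how those two metrics are computed and why they can differ. It is not the authors' evaluation code: the function names, the label strings, and the toy gold/predicted labels are all assumptions made for illustration, and none of the numbers come from the study's corpus.

```python
# Illustrative only: toy labels, not data from the study.
LABELS = ("positive", "negative", "neutral")  # assumed label names


def accuracy(gold, pred):
    """Fraction of citations whose predicted label matches the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced sample: a classifier that always predicts "neutral".
gold = ["neutral"] * 8 + ["positive", "negative"]
pred = ["neutral"] * 10

toy_accuracy = accuracy(gold, pred)
toy_macro_f1 = macro_f1(gold, pred)
print(f"accuracy={toy_accuracy:.3f} macro-F1={toy_macro_f1:.3f}")
```

In this hypothetical sample the always-neutral classifier scores well on accuracy but poorly on macro-F1, because macro-F1 gives each class equal weight regardless of how often it occurs. That is one reason a record may report both figures side by side.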