SEntFiN 1.0: Entity‐aware sentiment analysis for financial news
Author: | Ankur Sinha, Satishwar Kedas, Rishu Kumar, Pekka Malo |
---|---|
Year of publication: | 2022 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Artificial Intelligence (cs.AI); Computer Science - Computation and Language; Information Systems and Management; Computer Science - Artificial Intelligence; Computer Networks and Communications; I.2.7; Library and Information Sciences; Computation and Language (cs.CL); Machine Learning (cs.LG); Information Systems |
Source: | Journal of the Association for Information Science and Technology. |
ISSN: | 2330-1643 2330-1635 |
Description: | Fine-grained financial sentiment analysis on news headlines is a challenging task that requires human-annotated datasets to achieve high performance. Few studies have addressed sentiment extraction in the setting where a single news headline mentions multiple entities. To further research in this area, we make publicly available SEntFiN 1.0, a human-annotated dataset of 10,753 news headlines with entity-sentiment annotations, of which 2,847 headlines contain multiple entities, often with conflicting sentiments. We augment the dataset with a database of over 1,000 financial entities and their various representations in news media, amounting to over 5,000 phrases. We propose a framework that extracts entity-relevant sentiments using a feature-based approach rather than an expression-based approach. For sentiment extraction, we evaluate 12 different learning schemes that combine lexicon-based and pre-trained sentence representations with five classification approaches. Our experiments indicate that lexicon-based n-gram ensembles compare favorably with pre-trained word embedding schemes such as GloVe. Overall, RoBERTa and finBERT (domain-specific BERT) achieve the highest average accuracy of 94.29% and F1-score of 93.27%. Further, using over 210,000 entity-sentiment predictions, we validate the economic effect of sentiments on aggregate market movements over a long duration. 32 pages |
Database: | OpenAIRE |
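To illustrate what the description means by entity-sentiment annotations and by headlines whose entities carry conflicting sentiments, here is a minimal Python sketch. It is not the authors' code or the dataset's actual file format: the `AnnotatedHeadline` type, the two example headlines, the company names, and the rule that "conflicting" means more than one distinct label in a headline are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AnnotatedHeadline:
    """Hypothetical record: one headline plus a sentiment label per entity."""
    text: str
    sentiments: Dict[str, str]  # entity -> "positive" | "negative" | "neutral"


def has_multiple_entities(headline: AnnotatedHeadline) -> bool:
    """True when the headline is annotated with more than one entity."""
    return len(headline.sentiments) > 1


def has_conflicting_sentiments(headline: AnnotatedHeadline) -> bool:
    """Assumed definition: the entities do not all share the same label."""
    return len(set(headline.sentiments.values())) > 1


# Invented examples; these are not taken from SEntFiN 1.0.
HEADLINES: List[AnnotatedHeadline] = [
    AnnotatedHeadline(
        text="Alpha Bank gains as Beta Motors slips on weak sales",
        sentiments={"Alpha Bank": "positive", "Beta Motors": "negative"},
    ),
    AnnotatedHeadline(
        text="Gamma Steel posts record quarterly profit",
        sentiments={"Gamma Steel": "positive"},
    ),
]

multi_entity = [h for h in HEADLINES if has_multiple_entities(h)]
conflicting = [h for h in HEADLINES if has_conflicting_sentiments(h)]
```

A headline-level classifier would have to assign a single label to the first example, which is why the sentiment is stored per entity instead of per headline.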