Sentence-Level Propaganda Detection in News Articles with Transfer Learning and BERT-BiLSTM-Capsule Model

Autor: Dumitru-Clementin Cercel, Cristian Onose, Mircea-Adrian Tanase, George-Alexandru Vlad
Rok vydání: 2019
Předmět:
Zdroj: Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda.
DOI: 10.18653/v1/d19-5022
Popis: In recent years, the need for communication increased in online social media. Propaganda is a mechanism which was used throughout history to influence public opinion and it is gaining a new dimension with the rising interest of online social media. This paper presents our submission to NLP4IF-2019 Shared Task SLC: Sentence-level Propaganda Detection in news articles. The challenge of this task is to build a robust binary classifier able to provide corresponding propaganda labels, propaganda or non-propaganda. Our model relies on a unified neural network, which consists of several deep leaning modules, namely BERT, BiLSTM and Capsule, to solve the sentencelevel propaganda classification problem. In addition, we take a pre-training approach on a somewhat similar task (i.e., emotion classification) improving results against the cold-start model. Among the 26 participant teams in the NLP4IF-2019 Task SLC, our solution ranked 12th with an F1-score 0.5868 on the official test data. Our proposed solution indicates promising results since our system significantly exceeds the baseline approach of the organizers by 0.1521 and is slightly lower than the winning system by 0.0454.
Databáze: OpenAIRE