Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection
Authors: | Tanmoy Chakraborty, Viresh Gupta, Ramneek Kaur, Baani Leen Kaur Jolly |
---|---|
Year of publication: | 2019 |
Subject: |
Computer science; business.industry; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; 02 engineering and technology; computer.software_genre; CONTEST; SemEval; Task (project management); 03 medical and health sciences; 0302 clinical medicine; Character (mathematics); Test set; 030221 ophthalmology & optometry; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing; Word (computer architecture) |
Source: | SemEval@NAACL-HLT; Scopus-Elsevier |
DOI: | 10.18653/v1/s19-2159 |
Description: | In this paper, we present a news bias prediction system developed for a SemEval-2019 task. The system is based on XGBoost and uses character-level and word-level n-gram features, represented using TF-IDF and a count-vector-based correlation matrix, to predict whether an input news article is hyperpartisan. Our model achieved a precision of 68.3% on the test set provided by the contest organizers. We also ran our model on the BuzzFeed corpus and found that XGBoost with simple character-level n-gram embeddings performs well, with an accuracy of around 96%. |
Database: | OpenAIRE |
External link: |
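
The abstract describes character-level and word-level n-gram features weighted with TF-IDF. A minimal standard-library sketch of that feature-extraction step might look like the following. It is not the authors' implementation: the function names, the n-gram ranges, the smoothed IDF formula, and the L2 normalisation are all assumptions made for illustration, and the classifier itself (XGBoost in the paper) is deliberately left out.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return all character n-grams of length n_min..n_max (lowercased)."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def word_ngrams(text, n_min=1, n_max=2):
    """Return all whitespace-token n-grams of length n_min..n_max (lowercased)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, analyzer):
    """Map each document to a sparse {ngram: weight} dict, L2-normalised."""
    counts = [Counter(analyzer(doc)) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for count in counts:
        doc_freq.update(count.keys())

    n_docs = len(docs)
    vectors = []
    for count in counts:
        # Assumed weighting: raw term count times a smoothed IDF,
        # idf = ln((1 + N) / (1 + df)) + 1.
        vec = {gram: tf * (math.log((1 + n_docs) / (1 + doc_freq[gram])) + 1.0)
               for gram, tf in count.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({gram: w / norm for gram, w in vec.items()})
    return vectors


if __name__ == "__main__":
    # Toy documents, invented purely for illustration.
    articles = ["The senate passed the bill", "The senate is a disgrace"]
    char_features = tfidf_vectors(articles, char_ngrams)
    word_features = tfidf_vectors(articles, word_ngrams)
    print(len(char_features[0]), len(word_features[0]))
```

The resulting sparse vectors would then be handed to a gradient-boosted tree classifier; in practice a library vectorizer and a sparse matrix format would likely replace the dictionaries used here.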