Annotated dataset for the fake news classification in Slovak language
Author: | Viera Maslej-Kresnakova, Martin Sarnovsky, Nikola Hrabovska |
---|---|
Year of publication: | 2020 |
Subject: | Information retrieval; Artificial neural network; Computer science; business.industry; 05 social sciences; 02 engineering and technology; English language; Crowdsourcing; Field (computer science); language.human_language; 0202 electrical engineering, electronic engineering, information engineering; Task analysis; language; 020201 artificial intelligence & image processing; Slovak; Misinformation; Fake news; 0509 other social sciences; 050904 information & library sciences; business |
Source: | 2020 18th International Conference on Emerging eLearning Technologies and Applications (ICETA). |
DOI: | 10.1109/iceta51985.2020.9379254 |
Description: | Fake news detection is currently an active field of research. Detection methods based on natural language processing and machine learning are being developed to automatically identify possible misinformation in news articles. Training these models successfully requires annotated data. For English, multiple human-annotated datasets are already available and widely used in research. The main objective of the work presented in this paper was to create a similar dataset consisting of articles in the Slovak language. We collected the data from various local news portals, including reputable publishers as well as suspicious conspiracy portals. To obtain the annotations, we used a crowdsourcing approach. The annotated dataset was used in preliminary experiments, in which a neural network classifier was trained and evaluated. |
Database: | OpenAIRE |
External link: |