WikiContradiction: Detecting Self-Contradiction Articles on Wikipedia
Autor: | Hsu, Cheng, Li, Cheng-Te, Saez-Trumper, Diego, Hsu, Yi-Zhan |
---|---|
Rok vydání: | 2021 |
Předmět: | |
Druh dokumentu: | Working Paper |
Popis: | While Wikipedia has been utilized for fact-checking and claim verification to debunk misinformation and disinformation, it is essential to either improve article quality and rule out noisy articles. Self-contradiction is one of the low-quality article types in Wikipedia. In this work, we propose a task of detecting self-contradiction articles in Wikipedia. Based on the "self-contradictory" template, we create a novel dataset for the self-contradiction detection task. Conventional contradiction detection focuses on comparing pairs of sentences or claims, but self-contradiction detection needs to further reason the semantics of an article and simultaneously learn the contradiction-aware comparison from all pairs of sentences. Therefore, we present the first model, Pairwise Contradiction Neural Network (PCNN), to not only effectively identify self-contradiction articles, but also highlight the most contradiction pairs of contradiction sentences. The main idea of PCNN is two-fold. First, to mitigate the effect of data scarcity on self-contradiction articles, we pre-train the module of pairwise contradiction learning using SNLI and MNLI benchmarks. Second, we select top-K sentence pairs with the highest contradiction probability values and model their correlation to determine whether the corresponding article belongs to self-contradiction. Experiments conducted on the proposed WikiContradiction dataset exhibit that PCNN can generate promising performance and comprehensively highlight the sentence pairs the contradiction locates. Comment: Published at IEEE BigData 2021 (regular paper). Data and code can be access via: https://github.com/Wiki-Contradictory/Wiki-Self-Contradictory/ |
Databáze: | arXiv |
Externí odkaz: |