ARNLI: ARABIC NATURAL LANGUAGE INFERENCE ENTAILMENT AND CONTRADICTION DETECTION.

Authors: AL JALLAD, KHLOUD; GHNEIM, NADA
Subject:
Source: Computer Science; 2023, Vol. 24 Issue 2, p187-209, 23p
Abstract: Natural language inference (NLI) is an active research topic in natural language processing; contradiction detection between sentences is a special case of NLI. This is considered to be a difficult NLP task that has a significant influence when added as a component in many NLP applications (such as question-answering systems and text summarization). The Arabic language is one of the most challenging low-resource languages for detecting contradictions due to its rich lexical semantic ambiguity. We have created a data set of more than 12k sentences and named it ArNLI; it will be publicly available. Moreover, we have applied a new model that was inspired by Stanford's proposed contradiction-detection solutions for the English language. We proposed an approach for detecting contradictions between pairs of sentences in the Arabic language using a contradiction vector combined with a language model vector as an input to a machine-learning model. We analyzed the results of different traditional machine-learning classifiers and compared their results on our created data set (ArNLI) and on the automatic translation of both the PHEME and SICK English data sets. The best results were achieved by using the random forest classifier, with accuracies of 0.99, 0.60, and 0.75 on PHEME, SICK, and ArNLI, respectively. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
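
Note: The abstract describes feeding a contradiction vector combined with a language model vector into a machine-learning classifier. The following is a minimal Python sketch of how such a combined input could be assembled, not the authors' implementation. Everything in it is an assumption made for illustration: the three contradiction features (negation mismatch, number mismatch, token overlap), the short negation cue list, and the hashed bag-of-words function that stands in for a real language model vector. The abstract does not specify any of these.

```python
import hashlib

# Hypothetical negation cues (assumption); the abstract does not list the features used.
NEGATION_CUES = {"لا", "لم", "لن", "ليس"}


def tokenize(sentence):
    """Whitespace tokenization; a real Arabic pipeline would need proper segmentation."""
    return sentence.split()


def contradiction_vector(premise, hypothesis):
    """Three illustrative pairwise features (assumed, not taken from the paper)."""
    p, h = set(tokenize(premise)), set(tokenize(hypothesis))
    # 1.0 if exactly one of the two sentences contains a negation cue.
    neg_mismatch = float(bool(p & NEGATION_CUES) != bool(h & NEGATION_CUES))
    # 1.0 if both sentences contain numbers and the number sets differ.
    p_nums = {t for t in p if t.isdigit()}
    h_nums = {t for t in h if t.isdigit()}
    num_mismatch = float(bool(p_nums) and bool(h_nums) and p_nums != h_nums)
    # Jaccard overlap of the token sets.
    union = p | h
    overlap = len(p & h) / len(union) if union else 0.0
    return [neg_mismatch, num_mismatch, overlap]


def placeholder_lm_vector(sentence, dim=8):
    """Stand-in for a language model vector: a hashed bag-of-words count vector."""
    vec = [0.0] * dim
    for tok in tokenize(sentence):
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec


def combined_features(premise, hypothesis, dim=8):
    """Concatenate the contradiction vector with both sentence vectors."""
    return (
        contradiction_vector(premise, hypothesis)
        + placeholder_lm_vector(premise, dim)
        + placeholder_lm_vector(hypothesis, dim)
    )


if __name__ == "__main__":
    feats = combined_features("الولد يلعب في الحديقة", "الولد لا يلعب في الحديقة")
    print(len(feats), feats[:3])
```

Under these assumptions, one such feature list per sentence pair could be passed, together with its label, to a traditional classifier such as scikit-learn's `RandomForestClassifier`, which matches the classifier family the abstract reports as performing best.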