Using Sentence Similarity Measure for Plagiarism Detection of Arabic Documents

Authors: Wafa Wali, Abdelmajid Ben Hamadou, Bilel Gargouri
Year of publication: 2018
Subject:
Source: Advances in Intelligent Systems and Computing ISBN: 9783319763477
ISDA
DOI: 10.1007/978-3-319-76348-4_6
Description: Plagiarism detection is a challenging task, particularly in natural language texts. Several plagiarism detection tools have been developed for various natural languages, especially English. In this paper, we propose a new plagiarism detection system devoted to Arabic text documents. The system is based on an algorithm that uses a semantic sentence similarity measure. This measure aggregates three components in a linear function: the lexical-based component LS, which covers the common words; the semantic-based component SS, which uses synonymy relationships; and the syntactico-semantic-based component SSS, which uses semantic argument properties, notably the semantic argument and thematic role. This last component measures the semantic similarity between words that play the same syntactic role. For word-level semantic similarity, an information content-based measure estimates the degree of similarity between words by exploiting ElMadar, the LMF-standardized Arabic dictionary. Experiments on student thesis reports confirmed the performance of the proposed system, which showed promising capabilities in identifying literal plagiarism and some types of intelligent plagiarism. We also demonstrate its advantages over other plagiarism detection tools, including Aplag.
Database: OpenAIRE
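
The linear aggregation of LS, SS and SSS mentioned in the description could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the weights ALPHA, BETA and GAMMA, all function names, and the toy synonym table are hypothetical. The record does not give the coefficients, and the paper's word-level measure is information content-based over the ElMadar dictionary, which is replaced here by a simple synonym lookup. English tokens stand in for Arabic words to keep the example readable.

```python
from typing import Dict, List, Set

# Hypothetical weights: the description only states that the three
# components are combined in a linear function, not the coefficients.
ALPHA, BETA, GAMMA = 0.4, 0.3, 0.3


def lexical_similarity(s1: List[str], s2: List[str]) -> float:
    """LS stand-in: Jaccard overlap of the two word sets."""
    a, b = set(s1), set(s2)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def semantic_similarity(
    s1: List[str], s2: List[str], synonyms: Dict[str, Set[str]]
) -> float:
    """SS stand-in: share of words in s1 found in s2 directly or via a synonym."""
    if not s1:
        return 0.0
    words2 = set(s2)
    matched = sum(
        1 for w in s1 if w in words2 or (synonyms.get(w, set()) & words2)
    )
    return matched / len(s1)


def syntactico_semantic_similarity(
    roles1: Dict[str, str], roles2: Dict[str, str], synonyms: Dict[str, Set[str]]
) -> float:
    """SSS stand-in: compare only the words that fill the same role."""
    shared = set(roles1) & set(roles2)
    if not shared:
        return 0.0

    def same(w1: str, w2: str) -> bool:
        return w1 == w2 or w2 in synonyms.get(w1, set())

    return sum(1 for r in shared if same(roles1[r], roles2[r])) / len(shared)


def sentence_similarity(ls: float, ss: float, sss: float) -> float:
    """Linear aggregation of the three component scores."""
    return ALPHA * ls + BETA * ss + GAMMA * sss


# Toy data (hypothetical): one word replaced by a synonym.
synonyms = {"student": {"pupil"}}
s1 = ["the", "student", "wrote", "report"]
s2 = ["the", "pupil", "wrote", "report"]
roles1 = {"agent": "student", "theme": "report"}
roles2 = {"agent": "pupil", "theme": "report"}

ls = lexical_similarity(s1, s2)
ss = semantic_similarity(s1, s2, synonyms)
sss = syntactico_semantic_similarity(roles1, roles2, synonyms)
score = sentence_similarity(ls, ss, sss)
```

In this toy case the lexical score drops because one word differs on the surface, while the two semantic components still treat the sentences as equivalent. That is the kind of synonym-substitution ("intelligent") plagiarism the description says a purely lexical comparison would miss.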