Author: |
Nahar, Khalid M. O., Alshtaiwi, Ma’moun, Alikhashashneh, Enas, Shatnawi, Nahlah, Al-Shannaq, Moy’awiah A., Abual-Rub, Mohammed, BaniIsmail, Basel |
Subject: |
|
Source: |
International Journal of Advances in Soft Computing & Its Applications; Mar2024, Vol. 16 Issue 1, p40-55, 16p |
Abstract: |
Plagiarism detection is a key challenge in establishing the originality of a document, especially in science and research. Natural language processing methods can measure the degree of similarity between documents. In this paper, we tackle extrinsic plagiarism detection using semantic and syntactic approaches. The objective is to identify segments of a document that show strong similarity to a set of reference documents on the same topic. We present a hybrid approach that combines semantic and syntactic features based on Latent Dirichlet Allocation (LDA) and the Wu & Palmer algorithm. The proposed approach was evaluated on the public PAN13 dataset, achieving an overall accuracy of 85%. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |
External link: |
|
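
Illustrative note: the abstract names the Wu & Palmer measure, which scores two concepts by the depth of their deepest shared ancestor in a taxonomy: 2 * depth(LCS) / (depth(a) + depth(b)). The sketch below is a minimal illustration of that formula only, not the authors' implementation. The toy taxonomy and the names `PARENT`, `path_to_root`, `depth`, and `wu_palmer` are hypothetical, and depth is assumed to count nodes so that the root has depth 1.

```python
# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def path_to_root(node):
    """Return the chain [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(path_to_root(node))


def wu_palmer(a, b):
    """Wu & Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node that is also an ancestor of a
    # is the deepest common subsumer (LCS) in a tree-shaped taxonomy.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

Under these assumptions, siblings such as "dog" and "cat" score higher than concepts that only share the root, such as "dog" and "car", and a concept compared with itself scores 1.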