An Advance Approach for Spam Document Detection Using QAP Rabin-Karp Algorithm

Authors: Abhigyan Tiwary, Nidhi Ruthia
Publication year: 2020
Subject:
Source: Social Networking and Computational Intelligence ISBN: 9789811520709
DOI: 10.1007/978-981-15-2071-6_25
Description: Document spam is a term related to document copyright: it refers to plagiarism, the copying of content from a genuine document into another. Research is carried out around the globe to advance fields such as medicine, technology, and agriculture, and original content drives that progress. Many organizations and individuals reuse existing concepts to take credit for others' work. Document spamming is not a legal activity, and researchers have therefore devised many algorithms to counter it. The challenges in such approaches are processing the data and achieving accurate similarity detection. In this paper, a novel QAP-based Rabin-Karp algorithm is proposed. The approach combines score computation using QAP functions with a final similarity-measure computation using the Rabin-Karp algorithm. The experimental algorithm is implemented with a Java library and run on sample documents. It is compared with the traditional approach in terms of similarity measure, computation time, and throughput. The authors report an improvement on these parameters, which they take as evidence of the proposed approach's effectiveness.
Database: OpenAIRE
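
Note: The abstract does not define the QAP functions or the exact similarity measure, so the following is only a minimal sketch of the Rabin-Karp part, not the authors' method. It assumes that similarity is taken as the Jaccard overlap of rolling-hash fingerprints over fixed-length character windows. The class name `RabinKarpSimilarity`, the constants `BASE` and `MOD`, and the window length `k` are all hypothetical choices made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class RabinKarpSimilarity {
    // Hypothetical hash parameters, chosen for illustration only.
    private static final long BASE = 257L;
    private static final long MOD = 1_000_000_007L;

    /** Returns the rolling hashes of every k-character window in text. */
    public static Set<Long> fingerprints(String text, int k) {
        Set<Long> hashes = new HashSet<>();
        if (k <= 0 || text.length() < k) {
            return hashes;
        }

        // high = BASE^(k-1) mod MOD, the weight of the window's first character.
        long high = 1;
        for (int i = 1; i < k; i++) {
            high = (high * BASE) % MOD;
        }

        // Hash of the first window, computed directly.
        long hash = 0;
        for (int i = 0; i < k; i++) {
            hash = (hash * BASE + text.charAt(i)) % MOD;
        }
        hashes.add(hash);

        // Slide the window: drop the outgoing character, append the incoming one.
        for (int i = k; i < text.length(); i++) {
            long outgoing = (text.charAt(i - k) * high) % MOD;
            hash = (hash - outgoing + MOD) % MOD;
            hash = (hash * BASE + text.charAt(i)) % MOD;
            hashes.add(hash);
        }
        return hashes;
    }

    /** Jaccard overlap of the two fingerprint sets (an assumed similarity measure). */
    public static double similarity(String a, String b, int k) {
        Set<Long> fa = fingerprints(a, k);
        Set<Long> fb = fingerprints(b, k);
        if (fa.isEmpty() && fb.isEmpty()) {
            return 0.0;
        }
        Set<Long> common = new HashSet<>(fa);
        common.retainAll(fb);
        int union = fa.size() + fb.size() - common.size();
        return (double) common.size() / union;
    }

    public static void main(String[] args) {
        String original = "the quick brown fox jumps over the lazy dog";
        String suspect = "a quick brown fox jumped over the lazy dog";
        System.out.println(similarity(original, suspect, 5));
    }
}
```

The sketch compares hash values only. A full Rabin-Karp matcher would also verify candidate matches character by character, so hash collisions could slightly inflate the score here.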