Çizge Benzerliği Yöntemi ile Doküman Sınıflandırma [Document Classification Using the Graph Similarity Method]

Authors: Taner Uçkan, Faruk Ayata, Ali Karci, Cengiz Hark, Ebubekir Seyyarer
Year of publication: 2018
Source: 2018 International Conference on Artificial Intelligence and Data Processing (IDAP).
DOI: 10.1109/idap.2018.8620926
Description: Document classification is among the most extensively studied topics today. Text similarity is applied in many areas, such as detecting whether quoted passages appear elsewhere and returning fast, accurate results in search engines. A variety of methods are used to look for similarities between documents. When several documents are compared, similarity is measured by two basic approaches: word-based and sentence-based. Word-based similarity measurements rely on measures such as Jaccard, Dice, and cosine similarity. In this study, the paragraphs of different documents are split into sentences and represented as graphs, and the documents are classified using the Hamming distance, computed by applying XOR to the adjacency (neighborhood) matrices obtained from these documents.
Database: OpenAIRE
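
The approach outlined in the description (sentence graphs compared by the Hamming distance of their XOR-ed adjacency matrices) can be illustrated with a minimal Python sketch. The record does not say how sentences are linked into a graph, how matrices of different sizes are aligned, or how the final class is chosen, so the following are assumptions made only for illustration: two sentences are joined when they share at least one word, every document is truncated or zero-padded to a fixed `size`, and a document takes the label of the closest labelled reference. The names `split_sentences`, `adjacency_matrix`, `hamming_distance`, and `classify` are hypothetical and do not come from the paper.

```python
import re


def split_sentences(text):
    # Naive split on ., ! and ? (an assumption; the paper's splitter is not given).
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def adjacency_matrix(text, size):
    """Build a size x size 0/1 matrix whose nodes are sentences.

    Illustrative edge rule (an assumption): two sentences are adjacent
    when they share at least one lowercase word. Documents with fewer
    than `size` sentences are zero-padded; longer ones are truncated.
    """
    sentences = split_sentences(text)[:size]
    words = [set(s.lower().split()) for s in sentences]
    matrix = [[0] * size for _ in range(size)]
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            if words[i] & words[j]:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


def hamming_distance(a, b):
    # XOR corresponding cells and count the positions that differ.
    return sum(x ^ y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def classify(matrix, references):
    # references: list of (label, matrix) pairs; return the label of the
    # reference with the smallest Hamming distance (assumed decision rule).
    return min(references, key=lambda ref: hamming_distance(matrix, ref[1]))[0]


doc_a = "Graphs model text. Text has sentences. Cats sleep."
doc_b = "Graphs model text. Dogs bark. Cats sleep."
m_a = adjacency_matrix(doc_a, 3)
m_b = adjacency_matrix(doc_b, 3)
distance = hamming_distance(m_a, m_b)
label = classify(m_a, [("class_a", m_a), ("class_b", m_b)])
```

A smaller distance means the two sentence graphs have more edges in common, so under these assumptions a document is assigned to the class whose reference graph differs from it in the fewest matrix cells.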