Classification of XML Documents Using Semantic Resources

Authors: Abdeldjalil Ledmi, Mohammed El Habib Souidi, Makhlouf Ledmi
Publication year: 2021
Subject:
Source: 2021 International Conference on Recent Advances in Mathematics and Informatics (ICRAMI).
Description: In this paper, we investigate the automatic classification of XML documents into predefined categories. We propose a classification model that combines the content and structure of documents. Furthermore, we propose using semantic resources, specifically WordNet and an ontology linked to the terms of the corpus, to model the notion of semantic neighborhood by computing the similarity between terms. To validate the results, we used the INEX 2007 XML corpus.
Database: OpenAIRE
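
The description above mentions two ideas: combining document structure with content, and widening a document's terms to their semantic neighborhood through term-to-term similarity. The sketch below illustrates both in a minimal way and is not the authors' method. The names `extract_features`, `expand_terms`, `similarity`, and the `TOY_SIMILARITY` table are hypothetical; the hand-written similarity scores stand in for a WordNet- or ontology-based measure, and the 0.5 threshold is an arbitrary assumption.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical similarity scores; a real system would derive these
# from WordNet or a domain ontology rather than a fixed table.
TOY_SIMILARITY = {
    ("car", "automobile"): 0.9,
    ("car", "vehicle"): 0.7,
}


def similarity(a, b):
    """Symmetric lookup in the toy table; identical terms score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get((a, b), TOY_SIMILARITY.get((b, a), 0.0))


def extract_features(xml_text):
    """Return (structure, content) counters for one XML document.

    Structure features are root-to-element tag paths; content features
    are lowercase words found in element text (tail text is ignored).
    """
    paths, terms = Counter(), Counter()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths[path] += 1
        for word in re.findall(r"[a-z]+", (node.text or "").lower()):
            terms[word] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths, terms


def expand_terms(terms, vocabulary, threshold=0.5):
    """Weight each vocabulary term by its similar terms in the document.

    A document term contributes count * similarity to a vocabulary term
    when their similarity reaches the threshold.
    """
    expanded = {}
    for vocab_term in vocabulary:
        weight = 0.0
        for term, count in terms.items():
            score = similarity(term, vocab_term)
            if score >= threshold:
                weight += count * score
        if weight > 0:
            expanded[vocab_term] = weight
    return expanded


DOC = "<article><title>Car review</title><body>A fast car</body></article>"
VOCABULARY = ["car", "automobile", "vehicle", "review"]

if __name__ == "__main__":
    structure, content = extract_features(DOC)
    print(dict(structure))
    print(expand_terms(content, VOCABULARY))
```

With this representation, a document that only mentions "car" still receives weight on "automobile" and "vehicle", so it can match categories described with related vocabulary. The path counts and the expanded term weights could then be concatenated into one feature vector for any standard classifier.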