Classification of XML Documents Using Semantic Resources

Authors:	Abdeldjalil Ledmi, Mohammed El Habib Souidi, Makhlouf Ledmi
Publication year:	2021
Subject:	Structure (mathematical logic); ComputingMethodologies_PATTERNRECOGNITION; Information retrieval; Computer science; computer.internet_protocol; Similarity (psychology); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; WordNet; Ontology (information science); computer; XML
Source:	2021 International Conference on Recent Advances in Mathematics and Informatics (ICRAMI).
Description:	In this paper, we investigate the automatic classification of XML documents into predefined categories. We propose a classification model that combines the content and structure of documents. Furthermore, we propose to use semantic resources, specifically WordNet and ontology linked to the terms of the corpus, to model the notion of semantic neighborhood through a similarity calculation between terms. To validate the results, we used the INEX 2007 XML corpus.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::ab22cd6eba852c9eabee7b13099b9524 https://doi.org/10.1109/icrami52622.2021.9585995