New Descriptors of Textual Records: Getting Help from Frequent Itemsets

Authors:	Nadia Ghazzali, Ayoub Bokhabrine, Ismaïl Biskri
Publication year:	2020
Subjects:	020203 distributed computing; Information retrieval; lcsh:T58.5-58.64; k-medoids; lcsh:Information technology; Process (engineering); Computer science; Social activity; 02 engineering and technology; ascending hierarchical clustering; lcsh:QA75.5-76.95; frequent itemsets; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; lcsh:Electronic computers. Computer science
Source:	Vietnam Journal of Computer Science, Vol 7, Iss 4, Pp 355-372 (2020)
ISSN:	2196-8896 2196-8888
DOI:	10.1142/s2196888820500207
Description:	The analysis of numerical data, whether structured, semi-structured, or raw, is of paramount importance in many sectors of economic, scientific, or simply social activity. The extraction of association rules depends on the lexical quality of the text and on the minimum support set by the user. In this paper, we implemented a platform named “IDETEX” capable of extracting itemsets from textual data and using them to experiment with different clustering methods, such as k-Medoids and hierarchical clustering. The experiments conducted demonstrate the potential of the proposed approach for defining similarity between segments.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b19f4d343aa7170e1ee2a0ba4877a17c https://doi.org/10.1142/s2196888820500207