Tuning semantic association for modelling textual data

Authors: Juan M. Otero, Ansel Y. Rodriguez, Jose E. Medina-Pagola
Year of publication: 2011
Subject:
Source: RCIS
DOI: 10.1109/rcis.2011.6006821
Description: Text information processing depends critically on the proper representation of documents. Traditional models, such as the vector space model, have significant limitations because they do not consider semantic relations among terms. The Global Association Distance Model (GADM) is an alternative that includes this consideration in document representation; its basic assumption is that two documents should be closer if the shortest formal distances among the terms in each document are similar. The association strength function, which models the semantic relations among terms based on their formal distances, is a critical feature of GADM. In this paper, the association strength function is analyzed, a family of piecewise association strength functions is proposed, and a Simulated Annealing algorithm is used to tune this family and obtain an optimal model of semantic relation. We evaluate the significance of this tuning on a topic classification task.
Database: OpenAIRE
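
Illustration: the abstract describes tuning a piecewise association strength function with Simulated Annealing but gives no formulas. The sketch below is only a minimal, assumed illustration of that general idea, not the authors' method. The piecewise-linear shape, the breakpoints, the function names, the cooling schedule, and the toy objective (squared error against made-up target strengths, standing in for a classification-quality measure) are all hypothetical.

```python
import math
import random


def piecewise_strength(distance, breakpoints, values):
    """Piecewise-linear strength: values[i] at breakpoints[i], constant outside."""
    if distance <= breakpoints[0]:
        return values[0]
    for i in range(1, len(breakpoints)):
        if distance <= breakpoints[i]:
            x0, x1 = breakpoints[i - 1], breakpoints[i]
            t = (distance - x0) / (x1 - x0)
            return values[i - 1] + t * (values[i] - values[i - 1])
    return values[-1]


def simulated_annealing(objective, initial, steps=2000, t_start=1.0,
                        t_end=1e-3, step_size=0.1, seed=0):
    """Minimize objective over a list of values kept inside [0, 1]."""
    rng = random.Random(seed)
    current = list(initial)
    current_cost = objective(current)
    best, best_cost = list(current), current_cost
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (k / max(1, steps - 1))
        # Perturb one randomly chosen value and clamp it to [0, 1].
        candidate = list(current)
        i = rng.randrange(len(candidate))
        moved = candidate[i] + rng.uniform(-step_size, step_size)
        candidate[i] = min(1.0, max(0.0, moved))
        cost = objective(candidate)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
    return best, best_cost


# Hypothetical breakpoints (term distances) and made-up target strengths.
BREAKPOINTS = [0.0, 1.0, 3.0, 10.0]
TARGET = [(0.0, 1.0), (1.0, 0.8), (3.0, 0.3), (10.0, 0.0)]


def toy_objective(values):
    """Stand-in cost; a real study would score classification quality instead."""
    return sum((piecewise_strength(d, BREAKPOINTS, values) - s) ** 2
               for d, s in TARGET)


initial_values = [0.5] * len(BREAKPOINTS)
tuned_values, tuned_cost = simulated_annealing(toy_objective, initial_values)
```

Here `tuned_values` holds the strength assigned to each breakpoint after the search, and `tuned_cost` is never worse than the cost of `initial_values` because the best solution seen is tracked separately from the current one.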