Tuning semantic association for modelling textual data
Authors: | Juan M. Otero, Ansel Y. Rodriguez, Jose E. Medina-Pagola |
---|---|
Year of publication: | 2011 |
Subject: |
GADM
Computer science, Association (object-oriented programming), Information processing, Feature (linguistics), Simulated annealing, Vector space model, Piecewise, Data mining, Artificial intelligence, Representation (mathematics), Natural language processing |
Source: | RCIS |
DOI: | 10.1109/rcis.2011.6006821 |
Description: | Text information processing depends critically on the proper representation of documents. Traditional models, such as the vector space model, have significant limitations because they do not consider semantic relations among terms. The Global Association Distance Model (GADM) is an alternative that includes this consideration in document representation; its basic assumption is that two documents should be closer if the shortest formal distances among terms in each document are similar. The association strength function, which models the semantic relations among terms based on their formal distances, is a critical feature of GADM. In this paper, the association strength function is analyzed, a family of piecewise association strength functions is proposed, and a Simulated Annealing algorithm is used to tune it and obtain an optimal model of semantic relation. We evaluate its significance for a topic classification task. |
Database: | OpenAIRE |
External link: |
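
The tuning idea summarized in the description can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's actual formulation: `piecewise_strength` is a hypothetical piecewise-linear function mapping a formal distance between two terms to a strength in [0, 1], `anneal` is a generic simulated annealing loop, and `toy_objective` is a stand-in for the classification quality that the paper would measure on real documents.

```python
import math
import random


def piecewise_strength(d, breakpoints, values):
    """Hypothetical piecewise-linear association strength.

    `breakpoints` are increasing distances and `values` the strength at each
    one. Distances outside the breakpoint range take the nearest end value.
    """
    if d <= breakpoints[0]:
        return values[0]
    for i in range(1, len(breakpoints)):
        x0, x1 = breakpoints[i - 1], breakpoints[i]
        if d <= x1:
            t = (d - x0) / (x1 - x0)  # position inside the segment, 0..1
            return values[i - 1] + t * (values[i] - values[i - 1])
    return values[-1]


def anneal(objective, initial, steps=2000, t0=1.0, cooling=0.995,
           step_size=0.1, seed=0):
    """Generic simulated annealing that maximizes `objective`.

    Each step perturbs one parameter, clamps it to [0, 1], and accepts a
    worse candidate with probability exp(delta / temperature).
    """
    rng = random.Random(seed)
    current = list(initial)
    current_score = objective(current)
    best, best_score = list(current), current_score
    temp = t0
    for _ in range(steps):
        candidate = list(current)
        i = rng.randrange(len(candidate))
        moved = candidate[i] + rng.uniform(-step_size, step_size)
        candidate[i] = min(1.0, max(0.0, moved))
        score = objective(candidate)
        delta = score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = list(candidate), score
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


# Assumed breakpoints (formal distances) at which the strength is tuned.
BREAKPOINTS = [0.0, 2.0, 4.0]


def toy_objective(values):
    """Stand-in objective: closeness of the tuned function to an arbitrary
    target shape. A real objective would score topic classification."""
    target = [1.0, 0.5, 0.0]
    return -sum((v - t) ** 2 for v, t in zip(values, target))


tuned_values, tuned_score = anneal(toy_objective, [0.5, 0.5, 0.5])
```

In this sketch only the segment heights are tuned while the breakpoints stay fixed; tuning the breakpoints as well would only require adding them to the parameter vector passed to `anneal`.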