An Approach for Text Mining Based on Noun Phrases

Authors: Edilson Ferneda, Marcelo Ladeira, Hercules Antonio do Prado, Marcello Sandi Pinheiro
Year of publication: 2015
Subject:
Source: Intelligent Decision Technologies ISBN: 9783319198569
KES-IDT
Description: The use of noun phrases as descriptors for text mining vectors has been proposed to overcome the poor semantics of the traditional bag-of-words (BOW) representation. However, the solutions found in the literature are unsatisfactory, mainly because they rely on static definitions of noun phrases and because noun phrases per se do not enable an adequate representation of relevance, since they are expressions that rarely repeat. We present an approach that addresses these problems by (i) introducing a process that enables noun phrases to be defined interactively and (ii) treating similar noun phrases as a single term. A case study compares the approach proposed in this paper with one based on BOW. The main contribution of this paper is an improvement to the preprocessing phase of text mining, leading to better results in the overall process.
Database: OpenAIRE