Information fusion for text classification — an experimental comparison

Authors: Venu Dasigi, Reinhold C. Mann, Vladimir Protopopescu
Publication year: 2001
Subject:
Source: Pattern Recognition. 34:2413-2425
ISSN: 0031-3203
DOI: 10.1016/s0031-3203(00)00171-0
Description: This article reports on our experiments and results on the effectiveness of different feature sets and information fusion from some combinations of them in classifying free text documents into a given number of categories. We use different feature sets and integrate neural network learning into the method. The feature sets are based on the "latent semantics" of a reference library - a collection of documents adequately representing the desired concepts. We found that a larger reference library is not necessarily better. Information fusion almost always gives better results than the individual constituent feature sets, with certain combinations doing better than the others.
Database: OpenAIRE