Automatic document classification based on latent semantic analysis

Authors: Igor Nekrestyanov, Igor Kuralenok
Year of publication: 2000
Subject:
Source: Programming and Computer Software. 26:199-206
ISSN: 1608-3261, 0361-7688
DOI: 10.1007/bf02759469
Description: This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to extract semantic dependencies between words, and documents are classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test collection confirm the attractiveness of this approach. Because the method has relatively low computational complexity at the classification stage, it can be applied to the classification of document streams.
Database: OpenAIRE
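
The description above names the technique but gives no algorithmic detail, so the following is only a minimal sketch of topic classification with latent semantic analysis, not the authors' method. It assumes a raw term-count matrix, a truncated SVD to obtain the latent space, topic centroids built from labeled examples, and cosine similarity for the final decision. The toy corpus, the function names, and the choice k = 2 are all hypothetical.

```python
import numpy as np

def build_vocab(texts):
    """Map every distinct lowercase token in the texts to a row index."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}

def term_vector(text, vocab):
    """Raw term-count vector; words outside the vocabulary are ignored."""
    v = np.zeros(len(vocab))
    for w in text.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def fit_lsa(texts, k):
    """Return the vocabulary and the top-k left singular vectors of the
    term-by-document matrix (the basis of the latent space)."""
    vocab = build_vocab(texts)
    A = np.column_stack([term_vector(t, vocab) for t in texts])
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k]

def embed(text, vocab, Uk):
    """Project a text into the k-dimensional latent space."""
    return Uk.T @ term_vector(text, vocab)

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0.0 or nb == 0.0 else float(a @ b / (na * nb))

def topic_centroids(labeled, vocab, Uk):
    """Average the latent vectors of the example documents of each topic."""
    groups = {}
    for text, label in labeled:
        groups.setdefault(label, []).append(embed(text, vocab, Uk))
    return {label: np.mean(vecs, axis=0) for label, vecs in groups.items()}

def classify(text, topics, vocab, Uk):
    """Assign the topic whose centroid is closest in cosine similarity."""
    q = embed(text, vocab, Uk)
    return max(topics, key=lambda label: cosine(q, topics[label]))

# Hypothetical toy corpus (not TREC data), used only to exercise the code.
labeled = [
    ("stock market prices fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("market interest in bank stock", "finance"),
    ("team wins football match", "sport"),
    ("football player scores goal", "sport"),
    ("team player wins match goal", "sport"),
]
vocab, Uk = fit_lsa([text for text, _ in labeled], k=2)
topics = topic_centroids(labeled, vocab, Uk)

print(classify("bank stock prices", topics, vocab, Uk))
print(classify("football player goal", topics, vocab, Uk))
```

In this sketch the SVD is computed once, during training. Classifying a new document then costs one projection plus one cosine comparison per topic, which is consistent with the description's remark about low cost at the classification stage.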