Automatic document classification based on latent semantic analysis
Authors: | Igor Nekrestyanov, Igor Kuralenok |
---|---|
Year of publication: | 2000 |
Subject: | |
Source: | Programming and Computer Software. 26:199-206 |
ISSN: | 1608-3261 0361-7688 |
DOI: | 10.1007/bf02759469 |
Description: | This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to extract semantic dependencies between words, and documents are classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test collection confirm the attractiveness of this approach. Because the method has relatively low computational complexity at the classification stage, it can be applied to the classification of document streams. |
Database: | OpenAIRE |
External link: |
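
The abstract gives only a high-level outline, so the following is a generic sketch of classification with latent semantic analysis, not the authors' algorithm. It assumes raw term counts, a truncated SVD of the term-document matrix, one centroid per topic in the latent space, and cosine similarity for the final decision. All function names, the toy documents, and the choice k = 2 are illustrative assumptions.

```python
import numpy as np


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def term_doc_matrix(docs, vocab):
    # Rows are vocabulary terms, columns are documents, entries are raw counts.
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in tokenize(doc):
            if token in index:
                matrix[index[token], j] += 1.0
    return matrix


def fold_in(matrix, u_k, s_k):
    # Map term-space columns into the k-dimensional latent space:
    # d_hat = Sigma_k^{-1} U_k^T d, computed for every column at once.
    return (matrix.T @ u_k) / s_k


def train(docs, labels, k):
    vocab = sorted({t for d in docs for t in tokenize(d)})
    matrix = term_doc_matrix(docs, vocab)
    # Truncated SVD: keep the k largest singular triplets.
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    u_k, s_k = u[:, :k], s[:k]
    latent = fold_in(matrix, u_k, s_k)
    topics = sorted(set(labels))
    label_arr = np.array(labels)
    # One centroid per topic, averaged over its training documents.
    centroids = np.vstack([latent[label_arr == t].mean(axis=0) for t in topics])
    return vocab, u_k, s_k, topics, centroids


def classify(text, model):
    vocab, u_k, s_k, topics, centroids = model
    vec = fold_in(term_doc_matrix([text], vocab), u_k, s_k)[0]
    vec_norm = np.linalg.norm(vec)
    if vec_norm == 0.0:
        return None  # no known terms, so no basis for a decision
    sims = centroids @ vec / (np.linalg.norm(centroids, axis=1) * vec_norm)
    return topics[int(np.argmax(sims))]


# Toy data (hypothetical, not from TREC).
docs = [
    "football match goal team",
    "team goal player football",
    "player match goal team",
    "bank money loan interest",
    "money bank market interest",
]
labels = ["sports", "sports", "sports", "finance", "finance"]
model = train(docs, labels, k=2)
print(classify("the player scored a goal for the team", model))
```

In this sketch the SVD is computed once at training time, and classifying a new document costs one sparse-vector projection plus one similarity per topic, which is consistent with the abstract's remark about low cost at the classification stage.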