Automatic document classification based on latent semantic analysis

Authors:	Igor Nekrestyanov, Igor Kuralenok
Publication year:	2000
Subjects:	Set (abstract data type); Information retrieval; Probabilistic latent semantic analysis; Computational complexity theory; Explicit semantic analysis; Computer science; Latent semantic analysis; Document classification; Document clustering; computer.software_genre; computer; Text Retrieval Conference; Software
Source:	Programming and Computer Software. 26:199-206
ISSN:	1608-3261 0361-7688
DOI:	10.1007/bf02759469
Description:	This paper considers the problem of automatically classifying documents into a given set of topics. The proposed method uses latent semantic analysis to extract semantic dependencies between words, and documents are classified on the basis of these dependencies. Experiments on the standard TREC (Text REtrieval Conference) test collection confirm the attractiveness of this approach. Because the method has relatively low computational complexity at the classification stage, it can be applied to the classification of document streams.
Database:	OpenAIRE
External links:	https://explore.openaire.eu/search/publication?articleId=doi_________::0121e60352bc9c1c8f1c540e0e032fbf https://doi.org/10.1007/bf02759469
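
The record's description names the general approach (latent semantic analysis followed by topic classification) but gives no algorithmic detail. The following is a minimal illustrative sketch of that general idea, not the authors' method. It assumes a raw term-count matrix, a truncated SVD computed with NumPy, one centroid per topic in the latent space, and cosine similarity as the decision rule. The toy training texts, the topic labels, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical toy training set: (text, topic) pairs.
TRAIN = [
    ("stock market shares trading prices", "finance"),
    ("bank interest rates market loans", "finance"),
    ("football match goal team players", "sport"),
    ("tennis players match tournament team", "sport"),
]

def build_vocab(texts):
    """Map each distinct word to a row index of the term-document matrix."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}

def term_vector(text, vocab):
    """Raw term counts; words outside the vocabulary are ignored."""
    v = np.zeros(len(vocab))
    for w in text.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def fit_lsa(texts, vocab, k):
    """Return the top-k left singular vectors of the term-document matrix."""
    A = np.column_stack([term_vector(t, vocab) for t in texts])
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]

def project(text, vocab, U_k):
    """Fold a document into the k-dimensional latent space."""
    return U_k.T @ term_vector(text, vocab)

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0.0 or nb == 0.0 else float(a @ b / (na * nb))

def fit_centroids(train, vocab, U_k):
    """Average the latent vectors of the training documents of each topic."""
    by_topic = {}
    for text, topic in train:
        by_topic.setdefault(topic, []).append(project(text, vocab, U_k))
    return {t: np.mean(vs, axis=0) for t, vs in by_topic.items()}

def classify(text, vocab, U_k, centroids):
    """Assign the topic whose centroid is most similar in latent space."""
    q = project(text, vocab, U_k)
    return max(centroids, key=lambda t: cosine(q, centroids[t]))

texts = [t for t, _ in TRAIN]
vocab = build_vocab(texts)
U_k = fit_lsa(texts, vocab, k=2)
centroids = fit_centroids(TRAIN, vocab, U_k)

print(classify("market loans prices", vocab, U_k, centroids))   # finance
print(classify("team goal tournament", vocab, U_k, centroids))  # sport
```

In this sketch the SVD is computed once during training. Classifying a new document then costs one matrix-vector product plus one cosine comparison per topic, which illustrates why a method of this kind can be cheap at the classification stage.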