PLSA efficiency improvement based on initialization and approximation

Authors: Avanesov, V., Kozlov, I.
Language: Russian
Year of publication: 2014
Subject:
Source: Вестник Новгородского государственного университета им. Ярослава Мудрого.
ISSN: 2076-8052
Description: Probabilistic Latent Semantic Analysis (PLSA) is an effective technique for information retrieval, but it has a serious drawback: training is computationally expensive, which makes the model hard to fit on a large collection of documents. The aim of this paper is to improve the time efficiency of the training algorithm. Two approaches are explored: the first efficiently finds a suitable initial approximation; the second relies on the idea that, for most collections, the topics can be extracted from a relatively small fraction of the data.
Database: OpenAIRE