Parallel Latent Dirichlet Allocation Using Vector Processors
Author: Takuya Araki, Harumichi Yokoyama, Zhongyuan Tian
Year of publication: 2019
Subject: Topic model; Latent Dirichlet allocation; Parallel computing; Speedup; Training set; Computer science; Information systems; Electrical, electronic and information engineering; Artificial intelligence & image processing
Source: HPCC/SmartCity/DSS
DOI: 10.1109/hpcc/smartcity/dss.2019.00213
Description: Latent Dirichlet Allocation (LDA) is a widely used machine learning technique for topic modeling. The ever-growing size of training data and topic models has drawn great attention to parallel LDA implementations. In this paper, we present VLDA, which accelerates LDA training by exploiting data-level and thread-level parallelism on vector processors. A priority-aware scheduling approach is proposed to address the high memory requirements and workload imbalance of existing implementations. Experimental results on various datasets demonstrate a 3.2x to 12.9x speedup of VLDA over state-of-the-art parallel LDA methods on an x86 architecture, namely PLDA and LightLDA. We also show that VLDA can process large-scale datasets with good scalability to multiple vector processors. (A generic collapsed Gibbs sampling sketch for LDA follows this record.)
Database: OpenAIRE
External link:
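
Since this record carries only the abstract, the following is a minimal sketch of standard collapsed Gibbs sampling for LDA in Python/NumPy, not the authors' VLDA implementation; the priority-aware scheduling and vector-processor kernels of VLDA are not described here, and the function name `lda_gibbs` is hypothetical. The per-token full conditional over the K topics (the vector `p` below) is the computation that data-level (SIMD) parallelism on a vector processor would target, while thread-level parallelism typically partitions documents across cores.

```python
import numpy as np

def lda_gibbs(docs, n_topics, n_vocab, n_iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (generic reference sketch, not VLDA).

    docs: list of documents, each a list of integer word ids in [0, n_vocab).
    Returns the doc-topic and topic-word count matrices.
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    ndk = np.zeros((D, n_topics))        # per-document topic counts
    nkw = np.zeros((n_topics, n_vocab))  # per-topic word counts
    nk = np.zeros(n_topics)              # total tokens per topic
    z = []                               # topic assignment for every token

    # random initialization of topic assignments
    for d, doc in enumerate(docs):
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # remove the current assignment from the counts
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # full conditional over all K topics; NumPy evaluates it
                # as one vector operation, the part SIMD hardware accelerates
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_vocab * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw
```

A small usage example on a toy corpus (word ids over a vocabulary of size 5):

```python
docs = [[0, 1, 1, 2], [2, 3, 3, 4], [0, 2, 4, 4]]
ndk, nkw = lda_gibbs(docs, n_topics=2, n_vocab=5, n_iters=50)
phi = (nkw + 0.01) / (nkw + 0.01).sum(axis=1, keepdims=True)
print(phi)  # smoothed topic-word distributions
```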