Parallel Latent Dirichlet Allocation Using Vector Processors

Authors: Takuya Araki, Harumichi Yokoyama, Zhongyuan Tian
Year of publication: 2019
Source: HPCC/SmartCity/DSS
DOI: 10.1109/hpcc/smartcity/dss.2019.00213
Description: Latent Dirichlet Allocation (LDA) is a widely used machine learning technique for topic modeling. The ever-growing size of training data and topic models has drawn increasing attention to parallel LDA implementations. In this paper, we present VLDA, which accelerates LDA training by exploiting data-level and thread-level parallelism on vector processors. We propose a priority-aware scheduling approach to address the high memory requirements and workload imbalance of existing methods. Experimental results on various datasets demonstrate a 3.2x to 12.9x speedup of VLDA over PLDA and LightLDA, two state-of-the-art parallel LDA methods on x86 architectures. We also show that VLDA can process large-scale datasets and scales well across multiple vector processors.
Database: OpenAIRE
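
For background on the computation that parallel LDA systems such as the one described above typically accelerate, the sketch below shows a plain collapsed Gibbs sampler for LDA in Python/NumPy. It is not the paper's VLDA implementation (the record gives no code, and VLDA's priority-aware scheduling and vectorization are not reproduced here); the function name, parameters, and toy data are illustrative assumptions only.

```python
# Minimal collapsed Gibbs sampler for LDA with symmetric Dirichlet priors.
# Illustrative background only; NOT the VLDA method from the paper above.
import numpy as np

def lda_gibbs(docs, n_topics, n_vocab, alpha=0.1, beta=0.01,
              n_iters=100, seed=0):
    """docs: list of lists of word ids. Returns doc-topic and topic-word counts."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # document-topic counts
    n_kw = np.zeros((n_topics, n_vocab))     # topic-word counts
    n_k = np.zeros(n_topics)                 # tokens assigned to each topic
    # Random initial topic assignment for every token.
    z = [rng.integers(n_topics, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # Full conditional p(z = k | rest), up to a constant.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + n_vocab * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                # Reassign the token to the newly sampled topic.
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return n_dk, n_kw

# Toy usage: 3 documents over a 6-word vocabulary, 2 topics.
docs = [[0, 1, 0, 2], [3, 4, 5, 3], [0, 2, 1, 5]]
doc_topic, topic_word = lda_gibbs(docs, n_topics=2, n_vocab=6)
print(doc_topic)
```

The per-token sampling loop above is the part that parallel LDA implementations distribute across data and threads; how VLDA maps it onto vector processors and schedules the work is described in the paper itself.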