On efficient training of word classes and their application to recurrent neural network language models
Author: Rami Botros, Kazuki Irie, Hermann Ney, Martin Sundermeyer
Year: 2015
Subjects: Brown clustering, perplexity, artificial neural network, computer science, speech recognition, word error rate, machine learning, recurrent neural network, discriminative model, language model, artificial intelligence, cluster analysis, word (computer architecture)
Source: INTERSPEECH
DOI: 10.21437/interspeech.2015-345
Description: In this paper, we investigate various word clustering methods by studying two clustering algorithms, Brown clustering and the exchange algorithm, and three objective functions derived from different class-based language models (CBLM): the two-sided, predictive, and conditional models. In particular, we focus on an implementation of the exchange algorithm with improved speed. In total, we compare six clustering methods in terms of runtime and the perplexity (PP) of the CBLM on a French corpus, and show that our accelerated implementation of the exchange algorithm is up to 114 times faster than the original one and around 6 times faster than the best implementation of Brown clustering we could find, while achieving about the same (slightly better) PP. In addition, we conduct a keyword search experiment on the Babel Lithuanian task (IARPA-babel304b-v1.0b), which shows that the CBLM improves the word error rate (WER) but not the keyword search performance. Furthermore, we use these clustering techniques for the output layer of a recurrent neural network (RNN) language model (LM) and show that, in terms of the PP of the RNN LM, word classes trained under the predictive model perform slightly better than those trained under the other criteria we considered. Index Terms: word clustering, language modeling, neural network based language model, recurrent neural network, long short-term memory
Database: OpenAIRE
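The abstract centers on the exchange algorithm for word clustering. As an illustration of the basic procedure, here is a deliberately naive Python sketch under the two-sided class-bigram objective named above; it recomputes the class counts from scratch for every candidate move, whereas the speedups reported in the paper come precisely from avoiding such recomputation through incremental count updates. All function names are ours, not the authors'.

```python
import math
from collections import Counter, defaultdict

def xlogx(x):
    """x * log(x) with the convention 0 * log(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0

def objective(corpus, cls):
    """Two-sided class-bigram log-likelihood up to a constant:
       sum_{c1,c2} N(c1,c2) log N(c1,c2) - 2 * sum_c N(c) log N(c).
    The word term sum_w N(w) log N(w) does not depend on the
    clustering and is omitted.
    """
    bigram, unigram = defaultdict(int), Counter()
    for w in corpus:
        unigram[cls[w]] += 1
    for w1, w2 in zip(corpus, corpus[1:]):
        bigram[cls[w1], cls[w2]] += 1
    return (sum(xlogx(n) for n in bigram.values())
            - 2.0 * sum(xlogx(n) for n in unigram.values()))

def exchange_clustering(corpus, num_classes, max_iters=20):
    """Greedy exchange: move each word to the class that maximizes
    the objective; stop when a full sweep makes no move."""
    freq = Counter(corpus)
    vocab = sorted(freq, key=lambda w: (-freq[w], w))
    # Initial assignment: round-robin over the frequency-sorted vocabulary.
    cls = {w: i % num_classes for i, w in enumerate(vocab)}
    for _ in range(max_iters):
        moved = False
        for w in vocab:
            old_class = cls[w]
            best_class, best_obj = old_class, None
            for c in range(num_classes):
                cls[w] = c
                obj = objective(corpus, cls)  # naive full recount per move
                if best_obj is None or obj > best_obj:
                    best_class, best_obj = c, obj
            cls[w] = best_class
            moved = moved or best_class != old_class
        if not moved:
            break
    return cls

corpus = "the cat sat on the mat and the dog sat on the rug".split()
print(exchange_clustering(corpus, num_classes=3))
```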
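The abstract also applies the trained classes to the output layer of an RNN LM. A standard realization of this idea, and presumably the one meant here, is the class-factored softmax P(w | h) = P(c(w) | h) * P(w | c(w), h), which normalizes over the classes and over the words within one class rather than over the full vocabulary, reducing the cost of the output layer. The numpy sketch below assumes this factorization; the weight names and shapes are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def word_probability(h, word, cls, words_in_class, W_class, W_word):
    """P(word | h) = P(class(word) | h) * P(word | class(word), h).

    h              : RNN hidden state, shape (d,)
    cls            : dict mapping word -> class id
    words_in_class : dict mapping class id -> ordered list of member words
    W_class        : (num_classes, d) output weights for the class softmax
    W_word[c]      : (|class c|, d) output weights for the word softmax
                     restricted to class c
    """
    c = cls[word]
    p_class = softmax(W_class @ h)[c]
    members = words_in_class[c]
    p_word = softmax(W_word[c] @ h)[members.index(word)]
    return p_class * p_word

# Toy usage with random weights (illustrative only).
rng = np.random.default_rng(0)
d = 8
cls = {"cat": 0, "dog": 0, "runs": 1, "sleeps": 1}
words_in_class = {0: ["cat", "dog"], 1: ["runs", "sleeps"]}
W_class = rng.standard_normal((2, d))
W_word = {c: rng.standard_normal((len(ws), d))
          for c, ws in words_in_class.items()}
h = rng.standard_normal(d)
print(word_probability(h, "dog", cls, words_in_class, W_class, W_word))
```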