A Survey of Recent DNN Architectures on the TIMIT Phone Recognition Task
Autor: | Michalek, Josef, Vanek, Jan |
---|---|
Rok vydání: | 2018 |
Předmět: | |
Druh dokumentu: | Working Paper |
Popis: | In this survey paper, we have evaluated several recent deep neural network (DNN) architectures on a TIMIT phone recognition task. We chose the TIMIT corpus due to its popularity and broad availability in the community. It also simulates a low-resource scenario that is helpful in minor languages. Also, we prefer the phone recognition task because it is much more sensitive to an acoustic model quality than a large vocabulary continuous speech recognition (LVCSR) task. In recent years, many DNN published papers reported results on TIMIT. However, the reported phone error rates (PERs) were often much higher than a PER of a simple feed-forward (FF) DNN. That was the main motivation of this paper: To provide a baseline DNNs with open-source scripts to easily replicate the baseline results for future papers with lowest possible PERs. According to our knowledge, the best-achieved PER of this survey is better than the best-published PER to date. Comment: Submitted to TSD 2018, 21st International Conference on Text, Speech and Dialogue. arXiv admin note: substantial text overlap with arXiv:1806.07186 |
Databáze: | arXiv |
Externí odkaz: |