Bottle-Neck Feature Extraction Structures for Multilingual Training and Porting
Authors: | Frantisek Grezl, Martin Karafiat |
---|---|
Year of publication: | 2016 |
Subject: |
multilingual training; feature extraction; system porting; Stacked Bottle-Neck; DNN topology; Computer science; Speech recognition; Porting; Triphone; Bottle neck; Artificial intelligence; Natural language processing; Engineering and technology; Environmental sciences; Natural sciences; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Earth and related environmental sciences; General Earth and Planetary Sciences; General Environmental Science |
Source: | Procedia Computer Science SLTU |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2016.04.042 |
Description: | Stacked Bottle-Neck (SBN) feature extraction is a crucial part of modern automatic speech recognition (ASR) systems. The SBN network traditionally contains a hidden layer between the bottle-neck (BN) layer and the output layer. Recently, we have observed that an SBN architecture without this hidden layer (i.e. with a direct connection from the BN layer to the output layer) performs better for a single language but fails in scenarios where a network pre-trained in a multilingual fashion is ported to a target language. In this paper, we describe two strategies that allow the direct-connection SBN network to benefit from pre-training with a multilingual net: (1) pre-training the multilingual net with the hidden layer, which is discarded before porting to the target language, and (2) using only the direct-connection SBN with triphone targets both in multilingual pre-training and in porting to the target language. The results are reported on IARPA-BABEL limited language pack (LLP) data. |
Database: | OpenAIRE |
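
The two topologies named in the description, and strategy (1), can be illustrated with a small layer-list sketch. This is not the authors' implementation. The layer counts and sizes, the names `Layer`, `bn_network` and `port_to_target`, and the assumption that porting swaps the multilingual output layer for a target-language one are all hypothetical, chosen only to make the layer ordering concrete. Only one network is shown, not the full stacked pair.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str  # role of the layer in the network
    size: int  # number of units (every size used here is hypothetical)


def bn_network(n_inputs: int, n_hidden: int, n_bn: int,
               n_targets: int, post_bn_hidden: bool) -> List[Layer]:
    """Return the layer order of one bottle-neck network.

    post_bn_hidden=True  -> traditional topology (hidden layer after the BN layer)
    post_bn_hidden=False -> direct BN-layer to output-layer connection
    """
    layers = [
        Layer("input", n_inputs),
        Layer("hidden", n_hidden),
        Layer("hidden", n_hidden),
        Layer("bottleneck", n_bn),
    ]
    if post_bn_hidden:
        layers.append(Layer("post_bn_hidden", n_hidden))
    layers.append(Layer("output", n_targets))
    return layers


def port_to_target(multilingual: List[Layer], n_target_outputs: int,
                   keep_post_bn_hidden: bool) -> List[Layer]:
    """Sketch of porting: drop the multilingual output layer, optionally drop
    the post-BN hidden layer (strategy 1), and attach a new output layer sized
    for the target language (an assumption about what porting involves)."""
    kept = [
        layer for layer in multilingual
        if layer.name != "output"
        and (keep_post_bn_hidden or layer.name != "post_bn_hidden")
    ]
    return kept + [Layer("output", n_target_outputs)]


# Strategy (1): pre-train with the hidden layer, discard it before porting.
multilingual_net = bn_network(100, 500, 80, 3000, post_bn_hidden=True)
ported_net = port_to_target(multilingual_net, 1000, keep_post_bn_hidden=False)
print([layer.name for layer in ported_net])
```

In this sketch the ported network has the same layer order as a direct-connection network built from scratch for the target language. The difference in practice would be that its layers up to the bottle-neck start from multilingually pre-trained weights, which the layer list alone does not model.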