Showing 1 - 10 of 17 for search: '"Michel, Wilfried"'
We investigate a novel modeling approach for end-to-end neural network training using hidden Markov models (HMM) where the transition probabilities between hidden states are modeled and learned explicitly. Most contemporary sequence-to-sequence model
External link:
http://arxiv.org/abs/2310.02724
As one of the most popular sequence-to-sequence modeling approaches for speech recognition, the RNN-Transducer has achieved evolving performance with more and more sophisticated neural network models of growing size and increasing training epochs. Wh
External link:
http://arxiv.org/abs/2204.10586
Authors:
Zeineldeen, Mohammad; Xu, Jingjing; Lüscher, Christoph; Michel, Wilfried; Gerstenberger, Alexander; Schlüter, Ralf; Ney, Hermann
The recently proposed conformer architecture has been successfully used for end-to-end automatic speech recognition (ASR) architectures achieving state-of-the-art performance on different datasets. To our best knowledge, the impact of using conformer
External link:
http://arxiv.org/abs/2111.03442
To improve the performance of state-of-the-art automatic speech recognition systems it is common practice to include external knowledge sources such as language models or prior corrections. This is usually done via log-linear model combination using
External link:
http://arxiv.org/abs/2110.09324
Sequence discriminative training is a great tool to improve the performance of an automatic speech recognition system. It does, however, necessitate a sum over all possible word sequences, which is intractable to compute in practice. Current state-of
External link:
http://arxiv.org/abs/2110.09245
Authors:
Zeineldeen, Mohammad; Glushko, Aleksandr; Michel, Wilfried; Zeyer, Albert; Schlüter, Ralf; Ney, Hermann
Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions. The integration with an external LM trained on much more unpaired text usually leads to better performance. A Bayesian inte
External link:
http://arxiv.org/abs/2104.05544
With the success of neural network based modeling in automatic speech recognition (ASR), many studies investigated acoustic modeling and learning of feature extractors directly based on the raw waveform. Recently, one line of research has focused on
External link:
http://arxiv.org/abs/2104.04298
We present our transducer model on Librispeech. We study variants to include an external language model (LM) with shallow fusion and subtract an estimated internal LM. This is justified by a Bayesian interpretation where the transducer model prior is
External link:
http://arxiv.org/abs/2104.03006
Sequence-to-sequence models with an implicit alignment mechanism (e.g. attention) are closing the performance gap towards traditional hybrid hidden Markov models (HMM) for the task of automatic speech recognition. One important factor to improve word
External link:
http://arxiv.org/abs/2005.10049
We present a complete training pipeline to build a state-of-the-art hybrid HMM-based ASR system on the 2nd release of the TED-LIUM corpus. Data augmentation using SpecAugment is successfully applied to improve performance on top of our best SAT model
External link:
http://arxiv.org/abs/2004.00960