Showing 1 - 5 of 5 results for search: '"Schubert, Kjell"'
Author:
Le, Duc, Seide, Frank, Wang, Yuhao, Li, Yang, Schubert, Kjell, Kalinli, Ozlem, Seltzer, Michael L.
We show how factoring the RNN-T's output distribution can significantly reduce the computation cost and power consumption for on-device ASR inference with no loss in accuracy. With the rise in popularity of neural-transducer type models like the RNN-…
External link:
http://arxiv.org/abs/2211.00896
Author:
Pandey, Laxmi, Paul, Debjyoti, Chitkara, Pooja, Pang, Yutong, Zhang, Xuedong, Schubert, Kjell, Chou, Mark, Liu, Shu, Saraf, Yatharth
Inverse text normalization (ITN) is used to convert the spoken form output of an automatic speech recognition (ASR) system to a written form. Traditional handcrafted ITN rules can be complex to transcribe and maintain. Meanwhile, neural modeling appro…
External link:
http://arxiv.org/abs/2207.09674
Author:
Zhang, Xiaohui, Zhang, Frank, Liu, Chunxi, Schubert, Kjell, Chan, Julian, Prakash, Pradyot, Liu, Jun, Yeh, Ching-Feng, Peng, Fuchun, Saraf, Yatharth, Zweig, Geoffrey
In this work, to measure the accuracy and efficiency for a latency-controlled streaming automatic speech recognition (ASR) application, we perform comprehensive evaluations on three popular training criteria: LF-MMI, CTC and RNN-T. In transcribing so…
External link:
http://arxiv.org/abs/2011.04785
Author:
Jain, Mahaveer, Schubert, Kjell, Mahadeokar, Jay, Yeh, Ching-Feng, Kalgaonkar, Kaustubh, Sriram, Anuroop, Fuegen, Christian, Seltzer, Michael L.
Neural transducer-based systems such as RNN Transducers (RNN-T) for automatic speech recognition (ASR) blend the individual components of a traditional hybrid ASR system (acoustic model, language model, punctuation model, inverse text normalization)…
External link:
http://arxiv.org/abs/1911.01629
Author:
Yeh, Ching-Feng, Mahadeokar, Jay, Kalgaonkar, Kaustubh, Wang, Yongqiang, Le, Duc, Jain, Mahaveer, Schubert, Kjell, Fuegen, Christian, Seltzer, Michael L.
We explore options to use Transformer networks in a neural transducer for end-to-end speech recognition. Transformer networks use self-attention for sequence modeling and come with advantages in parallel computation and capturing contexts. We propose…
External link:
http://arxiv.org/abs/1910.12977