Showing 1 - 10 of 90 for search: '"Michael L. Seltzer"'
Author:
Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer
End-to-end multilingual ASR has become more appealing because of several reasons such as simplifying the training and deployment process and positive performance transfer from high-resource to low-resource languages. However, scaling up the number of
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f736edcbf4036a4038a642fa6cb5d292
http://arxiv.org/abs/2211.05756
Author:
Antoine Bruguier, Duc Le, Rohit Prabhavalkar, Dangna Li, Zhe Liu, Bo Wang, Eun Chang, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer
We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and finite state transducers (FSTs) in a mathematically consistent framework. Our method utilize
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c5040630a954cf9a93faa45dc9282ccd
Author:
Yangyang Shi, Ozlem Kalinli, Ganesh Venkatesh, Varun K. Nagaraja, Vikas Chandra, Michael L. Seltzer
Published in:
Interspeech 2021.
On-device speech recognition requires training models of different sizes for deploying on devices with various computational budgets. When building such different models, we can benefit from training them jointly to take advantage of the knowledge sh
Author:
Christian Fuegen, Abhinav Arora, Michael L. Seltzer, Ching-Feng Yeh, Suyoun Kim, Ozlem Kalinli, Duc Le
Published in:
Interspeech 2021.
Word Error Rate (WER) has been the predominant metric used to evaluate the performance of automatic speech recognition (ASR) systems. However, WER is sometimes not a good indicator for downstream Natural Language Understanding (NLU) tasks, such as in
Author:
Jay Mahadeokar, Alex Xiao, Christian Fuegen, Duc Le, Michael L. Seltzer, Yuan Shangguan, Chunyang Wu, Hang Su, Ozlem Kalinli, Yangyang Shi
Published in:
Interspeech 2021.
Author:
Michael L. Seltzer, Reinhold Haeb-Umbach, Shinji Watanabe, Bjorn Hoffmeister, Heiga Zen, Michiel Bacchiani, Mehrez Souden, Tomohiro Nakatani
Published in:
IEEE Signal Processing Magazine. 36:111-124
Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital home assistants with a spoken language interface have become a ubiquitous commodity today. This success has been made possible by major advancements in si
Author:
Ganesh Venkatesh, Alagappan Valliappan, Christian Fuegen, Jay Mahadeokar, Michael L. Seltzer, Vikas Chandra, Yuan Shangguan
Published in:
ICASSP
Recurrent transducer models have emerged as a promising solution for speech recognition on the current and next generation smart devices. The transducer models provide competitive accuracy within a reasonable memory footprint alleviating the memory c
Author:
Ching-Feng Yeh, Ozlem Kalinli, Yangyang Shi, Chunyang Wu, Rohit Prabhavalkar, Alex Xiao, Christian Fuegen, Duc Le, Michael L. Seltzer, Varun K. Nagaraja, Julian Chan, Jay Mahadeokar
We propose a dynamic encoder transducer (DET) for on-device speech recognition. One DET model scales to multiple devices with different computation capacities without retraining or finetuning. To trade off accuracy and latency, DET assigns differen
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c19b870d3c9858d0288619e077dd8bb4
http://arxiv.org/abs/2104.02176
Author:
Chunyang Wu, Jiatong Zhou, Christian Fuegen, Ozlem Kalinli, Hang Su, Duc Le, Yuan Shangguan, Jay Mahadeokar, Rohit Prabhavalkar, Michael L. Seltzer, Yangyang Shi
As speech-enabled devices such as smartphones and smart speakers become increasingly ubiquitous, there is growing interest in building automatic speech recognition (ASR) systems that can run directly on-device; end-to-end (E2E) speech recognition mod
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0a19a4e21dc2dcdbbb3904ee54bc5d6d
Author:
Yatharth Saraf, Michael L. Seltzer, Suyoun Kim, Christian Fuegen, Duc Le, Yangyang Shi, Ozlem Kalinli, Julian Chan, Gil Keren, Yuan Shangguan, Mahaveer Jain, Jay Mahadeokar
How to leverage dynamic contextual information in end-to-end speech recognition has remained an active research area. Previous solutions to this problem were either designed for specialized use cases that did not generalize well to open-domain scenar
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6e25c8e6ece4b166bb1d6fa115444004