Search results

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Authors: Shaojin Ding, Wang Weiran, Ding Zhao, Tara Sainath, Yanzhang He, Robert David, Rami Botros, Xin Wang, Rina Panigrahy, Qiao Liang, Dongseong Hwang, Ian McGraw, Rohit Prabhavalkar, Trevor Strohman

In this paper, we propose a dynamic cascaded encoder Automatic Speech Recognition (ASR) model, which unifies models for different deployment scenarios. Moreover, the model can significantly reduce model size and power consumption without loss of qual

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d8444eebad257eba5a9ce7891486e95e
http://arxiv.org/abs/2204.06164

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction

Authors: David Qiu, Ian McGraw, Yu Zhang, Yanzhang He, Qiujia Li, Liangliang Cao

Published in: Interspeech 2021.

Confidence scores are very useful for downstream applications of automatic speech recognition (ASR) systems. Recent works have proposed using neural networks to learn word or utterance confidence scores for end-to-end ASR. In those studies, word conf

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3af254bc9b3eab1095aaca5c9994b6fa
https://doi.org/10.21437/interspeech.2021-1207

Learning Word-Level Confidence for Subword End-To-End ASR

Authors: Yanzhang He, Ke Hu, Rohit Prabhavalkar, Deepti Bhatia, Yu Zhang, Wei Li, David Qiu, Qiujia Li, Tara N. Sainath, Bo Li, Ian McGraw, Liangliang Cao

Published in: ICASSP

We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior works have proposed training auxiliary confidence models for ASR systems, they do not extend natu

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::57fe29489a86717bd8521922ad80c736
https://doi.org/10.1109/icassp39728.2021.9413966

Multi-user VoiceFilter-Lite via Attentive Speaker Embedding

Authors: Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass. This is achieved by using an attention mechanism on multiple speaker embeddi

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::308a8e8a0f4ff2e3f0fd373a0cd2d01a

Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer

Authors: Yuan Shangguan, Kate Knister, Yanzhang He, Ian McGraw, Francoise Beaufays

Published in: INTERSPEECH

The demand for fast and accurate incremental speech recognition increases as the applications of automatic speech recognition (ASR) proliferate. Incremental speech recognizers output chunks of partially recognized words while the user is still talkin

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5eeb82a8c596cc512b5c769a7746f3e9
http://arxiv.org/abs/2006.01416

Rupture of an Occult Intracranial Mycotic Aneurysm after Intravenous Thrombolysis with Recombinant Tissue Plasminogen Activator for Acute Ischemic Stroke

Authors: Nathan A. James, Ian McGraw, C. Keith Stone, Margaret K. Strecker-McGraw, Jared Glenn, Karim Jabbar

Published in: The Journal of Emergency Medicine. 53:717-721

Background: The treatment of acute ischemic stroke with recombinant tissue plasminogen activator (rtPA) has become the mainstay of treatment, but its use carries a risk of subsequent intracranial hemorrhage (ICH). Guidelines have been developed to aid

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::02f29eeea71262e204b0019a74e5c1ce
https://doi.org/10.1016/j.jemermed.2017.08.032
