Search results

Report

Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models

Authors: Hernandez, Steven M.; Zhao, Ding; Ding, Shaojin; Bruguier, Antoine; Prabhavalkar, Rohit; Sainath, Tara N.; He, Yanzhang; McGraw, Ian

Continued improvements in machine learning techniques offer exciting new opportunities through the use of larger models and larger training datasets. However, there is a growing need to offer these new capabilities on-board low-powered devices such a

External link: http://arxiv.org/abs/2303.08343

Report

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Authors: Ding, Shaojin; Wang, Weiran; Zhao, Ding; Sainath, Tara N.; He, Yanzhang; David, Robert; Botros, Rami; Wang, Xin; Panigrahy, Rina; Liang, Qiao; Hwang, Dongseong; McGraw, Ian; Prabhavalkar, Rohit; Strohman, Trevor

In this paper, we propose a dynamic cascaded encoder Automatic Speech Recognition (ASR) model, which unifies models for different deployment scenarios. Moreover, the model can significantly reduce model size and power consumption without loss of qual

External link: http://arxiv.org/abs/2204.06164

Report

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

Authors: Ding, Shaojin; Rikhye, Rajeev; Liang, Qiao; He, Yanzhang; Wang, Quan; Narayanan, Arun; O'Malley, Tom; McGraw, Ian

Personalization of on-device speech recognition (ASR) has seen explosive growth in recent years, largely due to the increasing popularity of personal assistant features on mobile devices and smart home speakers. In this work, we present Personal VAD

External link: http://arxiv.org/abs/2204.03793

Report

Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

Authors: Rikhye, Rajeev; Wang, Quan; Liang, Qiao; He, Yanzhang; McGraw, Ian

VoiceFilter-Lite is a speaker-conditioned voice separation model that plays a crucial role in improving speech recognition and speaker verification by suppressing overlapping speech from non-target speakers. However, one limitation of VoiceFilter-Lit

External link: http://arxiv.org/abs/2202.12169

Academic article

Morel-Lavallee associated lymphedema treated with lymphovenous anastomosis: A case report

Authors: Sarrami, Shayan M.; Douglas, Nerone; McGraw, Ian; Parent, Brodie; Cruz, Carolyn De La

Published in: Injury, November 2024, 55(11)

Report

Multi-user VoiceFilter-Lite via Attentive Speaker Embedding

Authors: Rikhye, Rajeev; Wang, Quan; Liang, Qiao; He, Yanzhang; McGraw, Ian

In this paper, we propose a solution to allow speaker conditioned speech models, such as VoiceFilter-Lite, to support an arbitrary number of enrolled users in a single pass. This is achieved by using an attention mechanism on multiple speaker embeddi

External link: http://arxiv.org/abs/2107.01201

Report

Personalized Keyphrase Detection using Speaker and Environment Information

Authors: Rikhye, Rajeev; Wang, Quan; Liang, Qiao; He, Yanzhang; Zhao, Ding; Yiteng, Huang; Narayanan, Arun; McGraw, Ian

In this paper, we introduce a streaming keyphrase detection system that can be easily customized to accurately detect any phrase composed of words from a large vocabulary. The system is implemented with an end-to-end trained automatic speech recognit

External link: http://arxiv.org/abs/2104.13970

Report

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction

Authors: Qiu, David; He, Yanzhang; Li, Qiujia; Zhang, Yu; Cao, Liangliang; McGraw, Ian

Confidence scores are very useful for downstream applications of automatic speech recognition (ASR) systems. Recent works have proposed using neural networks to learn word or utterance confidence scores for end-to-end ASR. In those studies, word conf

External link: http://arxiv.org/abs/2104.12870

Report

Learning Word-Level Confidence For Subword End-to-End ASR

Authors: Qiu, David; Li, Qiujia; He, Yanzhang; Zhang, Yu; Li, Bo; Cao, Liangliang; Prabhavalkar, Rohit; Bhatia, Deepti; Li, Wei; Hu, Ke; Sainath, Tara N.; McGraw, Ian

We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior works have proposed training auxiliary confidence models for ASR systems, they do not extend natu

External link: http://arxiv.org/abs/2103.06716

Report

Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer

Authors: Shangguan, Yuan; Knister, Kate; He, Yanzhang; McGraw, Ian; Beaufays, Francoise

The demand for fast and accurate incremental speech recognition increases as the applications of automatic speech recognition (ASR) proliferate. Incremental speech recognizers output chunks of partially recognized words while the user is still talkin

External link: http://arxiv.org/abs/2006.01416
