Showing 1 - 10 of 34
for search: '"Moreno, Ignacio Lopez"'
Author:
Labrador, Beltrán, Zhu, Pai, Zhao, Guanlong, Scarpati, Angelo Scorza, Wang, Quan, Lozano-Diez, Alicia, Park, Alex, Moreno, Ignacio López
Keyword spotting systems often struggle to generalize to a diverse population with various accents and age groups. To address this challenge, we propose a novel approach that integrates speaker information into keyword spotting using Feature-wise Lin
External link:
http://arxiv.org/abs/2311.03419
A Multilingual Keyword Spotting (KWS) system detects spoken keywords over multiple locales. Conventional monolingual KWS approaches do not scale well to multilingual scenarios because of high development/maintenance costs and lack of resource sharing. To
External link:
http://arxiv.org/abs/2302.12961
In this work we propose a novel token-based training strategy that improves Transformer-Transducer (T-T) based speaker change detection (SCD) performance. The conventional T-T based SCD model loss optimizes all output tokens equally. Due to the spars
External link:
http://arxiv.org/abs/2211.06482
Author:
Labrador, Beltrán, Zhao, Guanlong, Moreno, Ignacio López, Scarpati, Angelo Scorza, Fowl, Liam, Wang, Quan
In this paper, we present a novel approach to adapt a sequence-to-sequence Transformer-Transducer ASR system to the keyword spotting (KWS) task. We achieve this by replacing the keyword in the text transcription with a special token and training
External link:
http://arxiv.org/abs/2211.06478
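The label-rewriting step mentioned in the abstract above (replacing the keyword in the text transcription with a special token) could look roughly like the following. This is only a minimal sketch, not the authors' implementation: the token string `<kw>`, the function name `replace_keyword`, and the case-insensitive whole-phrase matching are all assumptions made for illustration.

```python
import re

# Hypothetical special token; the abstract does not say what the token looks like.
KEYWORD_TOKEN = "<kw>"


def replace_keyword(transcript: str, keyword: str, token: str = KEYWORD_TOKEN) -> str:
    """Replace each whole-phrase, case-insensitive occurrence of `keyword` with `token`."""
    # re.escape keeps any punctuation in the keyword from being read as regex syntax;
    # \b restricts matches to word boundaries so substrings of longer words are skipped.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", flags=re.IGNORECASE)
    # A function replacement avoids interpreting backslashes in `token`.
    return pattern.sub(lambda _match: token, transcript)


if __name__ == "__main__":
    print(replace_keyword("hey assistant turn on the lights", "hey assistant"))
```

Transcripts rewritten this way would then serve as training targets, so that the model learns to emit the special token where the keyword is spoken.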
While recent research advances in speaker diarization mostly focus on improving the quality of diarization results, there is also an increasing interest in improving the efficiency of diarization systems. In this paper, we demonstrate that a multi-st
External link:
http://arxiv.org/abs/2210.13690
Author:
Hard, Andrew, Partridge, Kurt, Chen, Neng, Augenstein, Sean, Shah, Aishanee, Park, Hyun Jin, Park, Alex, Ng, Sara, Nguyen, Jessica, Moreno, Ignacio Lopez, Mathews, Rajiv, Beaufays, Françoise
We trained a keyword spotting model using federated learning on real user devices and observed significant improvements when the model was deployed for inference on phones. To compensate for data domains that are missing from on-device training cache
External link:
http://arxiv.org/abs/2204.06322
This paper presents a novel study of parameter-free attentive scoring for speaker verification. Parameter-free scoring provides the flexibility of comparing speaker representations without the need of an accompanying parametric scoring model. Inspire
External link:
http://arxiv.org/abs/2203.05642
Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech
In this paper, we introduce a novel language identification system based on conformer layers. We propose an attentive temporal pooling mechanism to allow the model to carry information in long-form audio via a recurrent form, such that the inference
External link:
http://arxiv.org/abs/2202.12163
Author:
Xia, Wei, Lu, Han, Wang, Quan, Tripathi, Anshuman, Huang, Yiling, Moreno, Ignacio Lopez, Sak, Hasim
In this paper, we present a novel speaker diarization system for streaming on-device applications. In this system, we use a transformer transducer to detect the speaker turns, represent each speaker turn by a speaker embedding, then cluster these emb
External link:
http://arxiv.org/abs/2109.11641
We propose self-training with a noisy student-teacher approach for streaming keyword spotting that can utilize large-scale unlabeled data and aggressive data augmentation. The proposed method applies aggressive data augmentation (spectral augmentation
External link:
http://arxiv.org/abs/2106.01604