Showing 1 - 10 of 10 for search: '"Huang, Jocelyn"'
Grapheme-to-phoneme (G2P) transduction is part of the standard text-to-speech (TTS) pipeline. However, G2P conversion is difficult for languages that contain heteronyms -- words that have one spelling but can be pronounced in multiple ways. G2P datas
External link:
http://arxiv.org/abs/2302.14523
In this work, we propose a zero-shot voice conversion method using speech representations trained with self-supervised learning. First, we develop a multi-task model to decompose a speech utterance into features such as linguistic content, speaker ch
External link:
http://arxiv.org/abs/2302.08137
Author:
Hameed, Isha, Sharpe, Samuel, Barcklow, Daniel, Au-Yeung, Justin, Verma, Sahil, Huang, Jocelyn, Barr, Brian, Bruss, C. Bayan
Explainable artificial intelligence (XAI) methods lack ground truth. In its place, method developers have relied on axioms to determine desirable properties for their explanations' behavior. For high stakes uses of machine learning that require expla
External link:
http://arxiv.org/abs/2207.05566
Author:
Song, Sikai, Cheng, Kai Wen, Farkouh, Ala'a, Carlson, Jason, Ritchie, Cayde, Kuang, Ruby, Wilkinson, Daniel, Buell, Matthew, Pearce, Joshua, Miles, Levi, Huang, Jocelyn, Chamberlin, David A., Chamberlin, Joshua D.
Published in:
In Journal of Pediatric Urology October 2024
Author:
Balam, Jagadeesh, Huang, Jocelyn, Lavrukhin, Vitaly, Deng, Slyne, Majumdar, Somshubra, Ginsburg, Boris
We present our experiments in training an end-to-end automatic speech recognition (ASR) model to be robust to noise using intensive data augmentation. We explore the efficacy of fine-tuning a pre-trained model to improve noise robustness, and we find it to
External link:
http://arxiv.org/abs/2010.12715
Author:
Huang, Jocelyn, Kuchaiev, Oleksii, O'Neill, Patrick, Lavrukhin, Vitaly, Li, Jason, Flores, Adriana, Kucsko, Georg, Ginsburg, Boris
In this paper, we demonstrate the efficacy of transfer learning and continuous learning for various automatic speech recognition (ASR) tasks. We start with a pre-trained English ASR model and show that transfer learning can be effectively and easily
External link:
http://arxiv.org/abs/2005.04290
Author:
Kriman, Samuel, Beliaev, Stanislav, Ginsburg, Boris, Huang, Jocelyn, Kuchaiev, Oleksii, Lavrukhin, Vitaly, Leary, Ryan, Li, Jason, Zhang, Yang
We propose a new end-to-end neural acoustic model for automatic speech recognition. The model is composed of multiple blocks with residual connections between them. Each block consists of one or more modules with 1D time-channel separable convolution
External link:
http://arxiv.org/abs/1910.10261
Author:
Kuchaiev, Oleksii, Li, Jason, Nguyen, Huyen, Hrinchuk, Oleksii, Leary, Ryan, Ginsburg, Boris, Kriman, Samuel, Beliaev, Stanislav, Lavrukhin, Vitaly, Cook, Jack, Castonguay, Patrice, Popova, Mariya, Huang, Jocelyn, Cohen, Jonathan M.
NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications through re-usability, abstraction, and composition. NeMo is built around neural modules, conceptual blocks of neural networks that take typed inputs and produce
External link:
http://arxiv.org/abs/1909.09577