Showing 1 - 10 of 27 for search: '"Fadi Biadsy"'
Model fine-tuning and adaptation have become a common approach for model specialization for downstream tasks or domains. Fine-tuning the entire model or a subset of the parameters using light-weight adaptation has shown considerable success across di
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5c8583bec2d2cd8544f8535380f04029
http://arxiv.org/abs/2203.12559
Authors:
Pedro J. Moreno Mengibar, Fadi Biadsy, Bhuvana Ramabhadran, Liyang Jiang, Xia Zhang, Rohan Doshi, Zhehuai Chen, Youzheng Chen, Andrea Chu
Published in:
Interspeech 2021.
Authors:
Fang Chu, Youzheng Chen, Rohan Doshi, Fadi Biadsy, Xia Zhang, Liyang Jiang, Andrew Rosenberg, Pedro J. Moreno, Bhuvana Ramabhadran
Published in:
ICASSP
We present an extended Parrotron model: a single, end-to-end network that enables voice conversion and recognition simultaneously. Input spectrograms are transformed to output spectrograms in the voice of a predetermined target speaker while also gen
Published in:
ICASSP
Code-switching occurs when the speaker alternates between two or more languages or dialects. It is a pervasive phenomenon in most Indic spoken languages. Code-switching poses a challenge in language modeling as it complicates the orthographic realiza
Published in:
INTERSPEECH
We present an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation. The network is trained end-to-end, learni
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::15becfa04d691db2c4fd7fdf9e739506
Published in:
INTERSPEECH
We describe Parrotron, an end-to-end-trained speech-to-speech conversion model that maps an input spectrogram directly to another spectrogram, without utilizing any intermediate discrete representation. The network is composed of an encoder, spectrog
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c73b391808afd72766be4ed8cfc5d4d7
Published in:
ICASSP
Language Models (LMs) for Automatic Speech Recognition (ASR) can benefit from utilizing non-linguistic contextual signals in modeling. Examples of these signals include the geographical location of the user speaking to the system and/or the identity
Published in:
INTERSPEECH
Published in:
INTERSPEECH
Published in:
INTERSPEECH