Showing 1 - 10 of 27 for search: '"Zhehuai Chen"'
Published in:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 28:2174-2183
End-to-end (E2E) systems have come to play an increasingly important role in automatic speech recognition (ASR) and have achieved strong performance. However, E2E systems recognize output word sequences directly from the input acoustic features, which can only be …
Authors:
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran
This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models. Existing multilingual TTS typically supports tens of languages, which are a small fraction of the t…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::52500f6059f16130fbb660682cd9fdfb
Authors:
Pedro J. Moreno Mengibar, Fadi Biadsy, Bhuvana Ramabhadran, Liyang Jiang, Xia Zhang, Rohan Doshi, Zhehuai Chen, Youzheng Chen, Andrea Chu
Published in:
Interspeech 2021.
Authors:
Bhuvana Ramabhadran, Mohammadreza Ghodsi, Yinghui Huang, Heiga Zen, Pedro J. Moreno, Yu Zhang, Zhehuai Chen, Andrew Rosenberg, Jesse Emond, Gary Wang
Published in:
Interspeech 2021.
Published in:
ICASSP
We introduce an asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models into one-pass decoding for large-vocabulary continuous speech recognition. Unlike standard one-pass decoding with on-the-fly compos…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dce12b7a29dfb3225c64dd37d5b4bfc3
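The abstract snippet above only names the technique; the paper's actual decoder is not reproduced here. As a rough, hypothetical illustration of the A* best-first search that underlies such decoders, here is a minimal sketch over a toy word lattice (the graph, heuristic values, and edge costs are invented for the example and do not come from the paper):

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Generic A* search: expand nodes in order of f = g + h.

    neighbors(node) -> iterable of (next_node, edge_cost)
    heuristic(node) -> admissible estimate of remaining cost to goal
    """
    # Frontier entries: (f, g, node, path-so-far).
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, cost in neighbors(node):
            new_g = g + cost
            # Only re-expand a node if we found a cheaper path to it.
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(nxt), new_g, nxt, path + [nxt]),
                )
    return None

# Hypothetical word lattice; edge costs stand in for -log probabilities.
graph = {
    "<s>": [("a", 1.0), ("b", 4.0)],
    "a": [("c", 2.0)],
    "b": [("c", 1.0)],
    "c": [("</s>", 1.0)],
    "</s>": [],
}
h = {"<s>": 3.0, "a": 2.0, "b": 2.0, "c": 1.0, "</s>": 0.0}

cost, path = a_star("<s>", "</s>", lambda n: graph[n], lambda n: h[n])
# Best path goes through "a" with total cost 1.0 + 2.0 + 1.0 = 4.0.
```

In a real decoder the "graph" is a search network composed from acoustic and language model scores, and the heuristic lets the search skip hypotheses that a big language model would rescore poorly; the sketch only shows the best-first expansion order that makes A* efficient.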
Published in:
INTERSPEECH
Published in:
INTERSPEECH
Authors:
Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno, Yonghui Wu, Zhehuai Chen, Andrew Rosenberg, Gary Wang
Published in:
ICASSP
Speech synthesis has advanced to the point of being nearly indistinguishable from human speech. However, efforts to train speech recognition systems on synthesized utterances have not been able to show that synthesized data can be effectively used …
Published in:
Speech Communication. 102:100-111
Speech recognition is a sequence prediction problem. Besides employing various deep learning approaches for frame-level classification, sequence-level discriminative training has proven indispensable for achieving state-of-the-art performa…