Search results

Academic article

Approximate Techniques in Solving Optimal Camera Placement Problems

Authors: Jian Zhao, Ruriko Yoshida, Sen-ching Samson Cheung, David Haws

Published in: International Journal of Distributed Sensor Networks, Vol 9 (2013)

While the theoretical foundation of the optimal camera placement problem has been studied for decades, its practical implementation has recently attracted significant research interest due to the increasing popularity of visual sensor networks. The m

External link: https://doaj.org/article/0d50f39fbc48416b946df122b66d8e0c

Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis

Authors: Raul Fernandez, David Haws, Guy Lorberbom, Slava Shechtman, Alexander Sorin

Sequence-to-Sequence Text-to-Speech architectures that directly generate low level acoustic features from phonetic sequences are known to produce natural and expressive speech when provided with adequate amounts of training data. Such systems can lea

External links: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9b5070089b06e24f85f0fc0cddec9a6a
http://arxiv.org/abs/2207.12262

VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

Authors: Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury

Beam search, which is the dominant ASR decoding algorithm for end-to-end models, generates tree-structured hypotheses. However, recent studies have shown that decoding with hypothesis merging can achieve a more efficient search with comparable or bet

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f5508e842bdd7f079e8bc6b494339d37

Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio

Authors: Zoltán Tüske, Brian Kingsbury, George Saon, Gakuto Kurata, David Haws

Published in: Interspeech 2021

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::fec0a2676550ca73ff5707b4d9228d09
https://doi.org/10.21437/interspeech.2021-1656

Reducing Exposure Bias in Training Recurrent Neural Network Transducers

Authors: Xiaodong Cui, Brian Kingsbury, George Saon, Zoltán Tüske, David Haws

When recurrent neural network transducers (RNNTs) are trained using the typical maximum likelihood criterion, the prediction network is trained only on ground truth label sequences. This leads to a mismatch during inference, known as exposure bias, w

External links: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::43552ddc9234b56d1b9945330f610fea
http://arxiv.org/abs/2108.10803

Stable Checkpoint Selection and Evaluation in Sequence to Sequence Speech Synthesis

Authors: Slava Shechtman, Raul Fernandez, David Haws

Published in: ICASSP

Autoregressive Attentive Sequence-to-Sequence (S2S) speech synthesis is considered state-of-the-art in terms of speech quality and naturalness, as evaluated on a finite set of testing utterances. However, it can occasionally suffer from stability iss

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::e100512107ef39ee55ea5043f7cdaa69
https://doi.org/10.1109/icassp39728.2021.9414402

Supervised and Unsupervised Approaches for Controlling Narrow Lexical Focus in Sequence-to-Sequence Speech Synthesis

Authors: David Haws, Slava Shechtman, Raul Fernandez

Published in: SLT

Although Sequence-to-Sequence (S2S) architectures have become state-of-the-art in speech synthesis, capable of generating outputs that approach the perceptual quality of natural samples, they are limited by a lack of flexibility when it comes to cont

External links: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2f7e4a7a83c0d7377009900a3d3b3598
http://arxiv.org/abs/2101.09940
