Showing 1 - 6 of 6 for search: '"Joanna Rownicka"'
Published in:
Odyssey
We aim to characterize how different speakers contribute to the perceived output quality of multi-speaker Text-to-Speech (TTS) synthesis. We automatically rate the quality of TTS using a neural network (NN) trained on human mean opinion score (MOS) ratings …
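The abstract describes scoring TTS output with a neural MOS predictor and attributing quality to individual speakers. A minimal sketch of the per-speaker aggregation step, assuming the utterance-level scores already come from such a predictor (the predictor itself, and the `speaker_mos` name, are illustrative, not the paper's pipeline):

```python
def speaker_mos(utterance_scores):
    """Aggregate utterance-level MOS predictions into per-speaker means.

    utterance_scores maps a speaker id to the list of MOS values a neural
    rater assigned to that speaker's synthesized utterances (the rater is
    out of scope here); the per-speaker mean is one simple way to compare
    how speakers contribute to perceived TTS quality.
    """
    return {spk: sum(scores) / len(scores)
            for spk, scores in utterance_scores.items()}
```

A speaker with a higher mean predicted MOS would, under this simple aggregate, be contributing more positively to the perceived quality of the multi-speaker model.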
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::69a879477e66365c2d40e4b3177f0257
http://arxiv.org/abs/2002.12645
Published in:
ASRU
Równicka, J, Bell, P & Renals, S 2020, Embeddings for DNN speaker adaptive training. In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). Institute of Electrical and Electronics Engineers (IEEE), pp. 479-486, IEEE Automatic Speech Recognition and Understanding Workshop 2019, Sentosa, Singapore, 14/12/19. https://doi.org/10.1109/ASRU46091.2019.9004028
In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs (DNN-SAT), focusing on a small amount of adaptation data per speaker. DNN-SAT can be viewed as learning a mapping from each embedding to transformation parameters …
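The abstract frames DNN-SAT as a learned mapping from a speaker embedding to transformation parameters. A minimal sketch of one such mapping, assuming an LHUC-style control network with hypothetical learned weights `W` and `b` applied as per-dimension feature scales (not the paper's actual architecture):

```python
import math

def sat_transform(frames, embedding, W, b):
    """Map a speaker embedding to per-dimension feature scales, then apply them.

    W (feat_dim x emb_dim) and b (feat_dim) are hypothetical learned
    parameters of a small control network; 2*sigmoid keeps each scale in
    (0, 2), an LHUC-style re-parameterisation. With W = 0 and b = 0 every
    scale is 1.0, so the acoustic features pass through unchanged.
    """
    scales = [
        2.0 / (1.0 + math.exp(-(sum(w * e for w, e in zip(row, embedding)) + bias)))
        for row, bias in zip(W, b)
    ]
    return [[s * f for s, f in zip(scales, frame)] for frame in frames]
```

At test time only the (small) embedding changes per speaker, which is why such mappings suit the low-adaptation-data regime the abstract mentions.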
Published in:
ICASSP
Równicka, J, Bell, P & Renals, S 2020, Multi-Scale Octave Convolutions for Robust Speech Recognition. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Institute of Electrical and Electronics Engineers (IEEE), pp. 7019-7023, 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, Barcelona, Spain, 4/05/20. https://doi.org/10.1109/ICASSP40776.2020.9053703
We propose a multi-scale octave convolution layer to learn robust speech representations efficiently. Octave convolutions were introduced by Chen et al. [1] in the computer vision field to reduce the spatial redundancy of the feature maps by decomposing …
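An octave convolution splits the feature maps into a high-frequency branch at full resolution and a low-frequency branch at half resolution, with four kernels mixing information within and across the two. A minimal sketch of that idea on a 1-D signal, assuming average pooling for downsampling and nearest-neighbour upsampling (the 1-D setting and kernel names are illustrative, not the paper's multi-scale formulation):

```python
def avg_pool2(x):
    # Halve the resolution by averaging adjacent pairs of samples.
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def upsample2(x):
    # Nearest-neighbour upsampling back to full resolution.
    return [v for v in x for _ in range(2)]

def conv1d(x, w):
    # 1-D convolution with zero padding, output the same length as x.
    k, pad = len(w), len(w) // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w[j] * xp[i + j] for j in range(k)) for i in range(len(x))]

def octave_conv(x_h, x_l, w_hh, w_hl, w_lh, w_ll):
    """One octave-convolution step on a 1-D signal.

    x_h: high-frequency features at full resolution
    x_l: low-frequency features at half resolution
    The four kernels (w_hh, w_hl, w_lh, w_ll) carry the intra- and
    inter-frequency paths; cross-resolution paths pool or upsample first.
    """
    y_h = [a + b for a, b in zip(conv1d(x_h, w_hh),
                                 upsample2(conv1d(x_l, w_lh)))]
    y_l = [a + b for a, b in zip(conv1d(x_l, w_ll),
                                 conv1d(avg_pool2(x_h), w_hl))]
    return y_h, y_l
```

The efficiency gain comes from the low-frequency branch: every kernel applied there touches half as many positions, and a multi-scale variant would extend the same pattern to further octaves.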
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7fae2ea134df7c44cdcf3ae4aea911de
http://arxiv.org/abs/1910.14443
Author:
Jennifer Williams, Joanna Rownicka
Published in:
INTERSPEECH
We present our system submission to the ASVspoof 2019 Challenge Physical Access (PA) task. The objective for this challenge was to develop a countermeasure that identifies speech audio as either bona fide or intercepted and replayed. The target prediction …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0e0071e460b2d4dc63473a121ec8bb5d
http://arxiv.org/abs/1909.10324
Published in:
SLT
Równicka, J, Bell, P & Renals, S 2019, Analyzing deep CNN-based utterance embeddings for acoustic model adaptation. In 2018 IEEE Spoken Language Technology Workshop (SLT). Institute of Electrical and Electronics Engineers (IEEE), pp. 235-241, 2018 IEEE Workshop on Spoken Language Technology (SLT), Athens, Greece, 18/12/18. https://doi.org/10.1109/SLT.2018.8639036
We explore why deep convolutional neural networks (CNNs) with small two-dimensional kernels, primarily used for modeling spatial relations in images, are also effective in speech recognition. We analyze the representations learned by deep CNNs …
Published in:
ASRU
Równicka, J, Renals, S & Bell, P 2018, Simplifying very deep convolutional neural network architectures for robust speech recognition. In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2017). Institute of Electrical and Electronics Engineers (IEEE), pp. 236-243, 2017 IEEE Automatic Speech Recognition and Understanding Workshop, Okinawa, Japan, 16/12/17. https://doi.org/10.1109/ASRU.2017.8268941
Very deep convolutional neural networks (VDCNNs) have been successfully used in computer vision. More recently, VDCNNs have been applied to speech recognition, using architectures adopted from computer vision. In this paper, we experimentally analyse …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dbe5be76e18ff8aa022a212c92a2aef0
https://www.pure.ed.ac.uk/ws/files/44898955/rownicka_asru17.pdf