Showing 1 - 10 of 33 for search: '"Alexandra Markó"'
Published in:
Sensors, Vol 22, Iss 22, p 8601 (2022)
Within speech processing, articulatory-to-acoustic mapping (AAM) methods can apply ultrasound tongue imaging (UTI) as an input. (Micro)convex transducers are mostly used, which provide a wedge-shape visual image. However, this process is optimized fo
External link:
https://doaj.org/article/80c85c4f4ec340d3a7ddbec6dd86472c
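The snippet above refers to the wedge-shaped image that (micro)convex transducers produce from raw ultrasound scanlines. Below is a minimal sketch of that scan-conversion step, assuming raw data arrives as one scanline per beam angle; the function name and parameters (angle_span_deg, out_size) are illustrative, not the paper's pipeline.

# Sketch only: map each Cartesian pixel back to (beam angle, depth) and
# sample the nearest scanline, producing the familiar wedge-shaped image.
import numpy as np

def scanlines_to_wedge(scanlines, angle_span_deg=90.0, out_size=256):
    """scanlines: array of shape (n_lines, n_depths), one row per beam angle."""
    n_lines, n_depths = scanlines.shape
    half_span = np.deg2rad(angle_span_deg) / 2.0

    # Cartesian grid with the transducer at the bottom centre of the image.
    x = np.linspace(-1.0, 1.0, out_size)
    y = np.linspace(1.0, 0.0, out_size)
    xx, yy = np.meshgrid(x, y)

    radius = np.sqrt(xx**2 + yy**2)   # normalised depth from the transducer
    theta = np.arctan2(xx, yy)        # beam angle from the centre line

    # Map (theta, radius) to scanline / depth indices; mask pixels outside the fan.
    line_idx = np.round((theta + half_span) / (2 * half_span) * (n_lines - 1)).astype(int)
    depth_idx = np.round(radius * (n_depths - 1)).astype(int)
    valid = (np.abs(theta) <= half_span) & (radius <= 1.0)

    wedge = np.zeros((out_size, out_size), dtype=scanlines.dtype)
    wedge[valid] = scanlines[line_idx[valid], depth_idx[valid]]
    return wedge

# Example: a dummy frame of 64 scanlines x 842 depth samples.
frame = np.random.rand(64, 842).astype(np.float32)
image = scanlines_to_wedge(frame)
print(image.shape)  # (256, 256)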
Published in:
Magyar Nyelv. 117:38-50
Published in:
Interspeech 2021.
Articulatory-to-acoustic mapping seeks to reconstruct speech from a recording of the articulatory movements, for example, an ultrasound video. Just like speech signals, these recordings represent not only the linguistic content, but are also highly s
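As a rough illustration of what articulatory-to-acoustic mapping does at the frame level (my own sketch, not the model described in this paper), a small convolutional network can regress one spectral frame, for example 80 mel bands, from one ultrasound tongue image. The class and layer sizes below are assumptions.

import torch
import torch.nn as nn

class UTI2SpecNet(nn.Module):
    def __init__(self, n_mels=80):
        super().__init__()
        # Encode the ultrasound frame into a small feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Regress one spectral frame per input image.
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, n_mels),
        )

    def forward(self, x):            # x: (batch, 1, H, W) ultrasound frames
        return self.regressor(self.encoder(x))

model = UTI2SpecNet()
frames = torch.randn(8, 1, 64, 128)  # dummy batch of ultrasound frames
mel = model(frames)
print(mel.shape)                     # torch.Size([8, 80])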
Author:
Géza Németh, Tamás Gábor Csapó, Gábor Gosztolya, Csaba Zainkó, Amin Honarmandi Shandiz, László Tóth, Alexandra Markó
For articulatory-to-acoustic mapping, typically only limited parallel training data is available, making it impossible to apply fully end-to-end solutions like Tacotron2. In this paper, we experimented with transfer learning and adaptation of a Tacot
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b350c60f09adc9ed7e9f44e52d333a5
http://arxiv.org/abs/2107.12051
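The abstract describes transfer learning and adaptation when only limited parallel training data is available. A hedged sketch of the general pattern (not the authors' exact recipe): start from a model pre-trained on a large corpus, freeze most of it, and fine-tune only a small part on the scarce parallel data. The stand-in model and the "decoder" prefix below are assumptions.

import torch

def freeze_all_but(model: torch.nn.Module, trainable_prefixes=("decoder",)):
    """Freeze every parameter except those whose name starts with a given prefix."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(p) for p in trainable_prefixes)
    return [p for p in model.parameters() if p.requires_grad]

# Example with a stand-in model; a real setup would load pre-trained weights
# (e.g. a Tacotron2 checkpoint) instead of these two linear layers.
model = torch.nn.Sequential()
model.add_module("encoder", torch.nn.Linear(80, 256))
model.add_module("decoder", torch.nn.Linear(256, 80))
trainable = freeze_all_but(model, trainable_prefixes=("decoder",))
optimizer = torch.optim.Adam(trainable, lr=1e-4)   # small learning rate for adaptation
print(sum(p.numel() for p in trainable), "trainable parameters")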
Articulatory information has been shown to be effective in improving the performance of HMM-based and DNN-based text-to-speech synthesis. Speech synthesis research focuses traditionally on text-to-speech conversion, when the input is text or an estim
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::93d3f0061c81ebbc09bcf1b9c2d6134f
http://arxiv.org/abs/2107.02003
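The snippet notes that articulatory information can improve HMM- and DNN-based text-to-speech synthesis. One common, simple way to exploit it (illustrative only, not necessarily this paper's method) is to concatenate frame-aligned articulatory features with the usual input features of the acoustic model; the feature dimensions below are dummies.

import numpy as np

linguistic_feats = np.random.rand(200, 300)    # frames x linguistic features (dummy)
articulatory_feats = np.random.rand(200, 12)   # frames x articulatory features (dummy)

# Frame-aligned concatenation gives the acoustic model access to both streams.
model_input = np.concatenate([linguistic_feats, articulatory_feats], axis=1)
print(model_input.shape)   # (200, 312)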
Published in:
Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2021) ISBN: 9783030763459
AICV
Besides the well-known classification task, these days neural networks are frequently being applied to generate or transform data, such as images and audio signals. In such tasks, the conventional loss functions like the mean squared error (MSE) may
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::853ab4f369ba19681313cf5575b84425
https://doi.org/10.1007/978-3-030-76346-6_39
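The snippet contrasts conventional mean squared error with other losses for generative tasks. Below is a hedged illustration (my own, not necessarily the loss the paper proposes): plain per-element MSE next to one common alternative, a spectral-convergence-style relative error.

import torch

def mse_loss(pred, target):
    # Averages the squared error over every element independently.
    return torch.mean((pred - target) ** 2)

def spectral_convergence_loss(pred, target, eps=1e-8):
    # Relative Frobenius-norm error: the penalty is scaled by the overall
    # energy of the target, rather than treating every element equally.
    num = torch.sqrt(torch.sum((target - pred) ** 2))
    den = torch.sqrt(torch.sum(target ** 2)) + eps
    return num / den

pred = torch.rand(4, 80, 100)     # e.g. predicted mel-spectrograms (dummy)
target = torch.rand(4, 80, 100)
print(mse_loss(pred, target).item(), spectral_convergence_loss(pred, target).item())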
Author:
Alexandra Markó, Zsuzsa C. Vladár
Published in:
Magyar Nyelv.
Published in:
INTERSPEECH
For articulatory-to-acoustic mapping using deep neural networks, typically spectral and excitation parameters of vocoders have been used as the training targets. However, vocoding often results in buzzy and muffled final speech quality. Therefore, in
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::614b451b99080bcca1af18d70326daee
http://arxiv.org/abs/2008.03152
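A minimal sketch of what "spectral and excitation parameters of vocoders" means as training targets: per-frame mel-spectral features plus an excitation track (F0 and voicing) extracted from the speech signal. The feature choices and settings here are assumptions for illustration, not the paper's configuration.

import numpy as np
import librosa

def vocoder_targets(wav_path, sr=22050, hop_length=256):
    y, sr = librosa.load(wav_path, sr=sr)
    # Spectral envelope proxy: log mel-spectrogram, one column per frame.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=80, hop_length=hop_length)
    log_mel = np.log(mel + 1e-6)
    # Excitation: fundamental frequency plus a voiced/unvoiced flag per frame.
    f0, voiced, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr, hop_length=hop_length)
    f0 = np.nan_to_num(f0)                      # unvoiced frames -> 0 Hz
    return log_mel.T, f0, voiced                # frames x features

# targets = vocoder_targets("sample.wav")   # requires an actual wav file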
Published in:
Approaches to Hungarian ISBN: 9789027204905
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::947cfe9f17a4d004b5cdf8c0ebf9a1af
https://doi.org/10.1075/atoh.16.03dem
Published in:
Acta Polytechnica Hungarica.
Silent Speech Interfaces (SSI) perform articulatory-to-acoustic mapping to convert articulatory movement into synthesized speech. Its main goal is to aid the speech handicapped, or to be used as a part of a communication system operating in silence-r
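The SSI pipeline described here has two stages: a mapping model predicts spectral frames from the articulatory recording, and a synthesis step turns those frames into a waveform. Below is a hedged sketch of the second stage using Griffin-Lim as a simple stand-in for whatever vocoder a given SSI system actually uses; the function name and settings are assumptions.

import numpy as np
import librosa

def synthesize_from_log_mel(log_mel, sr=22050, n_fft=1024, hop_length=256):
    """log_mel: (n_mels, n_frames) log mel-spectrogram predicted by the mapping model."""
    mel = np.exp(log_mel)
    # Invert the mel filterbank, then recover phase iteratively with Griffin-Lim.
    linear = librosa.feature.inverse.mel_to_stft(mel, sr=sr, n_fft=n_fft)
    return librosa.griffinlim(linear, hop_length=hop_length)

# Example with a dummy prediction of 100 frames and 80 mel bands:
dummy_log_mel = np.random.randn(80, 100).astype(np.float32)
wav = synthesize_from_log_mel(dummy_log_mel)
print(wav.shape)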