Showing 1 - 10 of 90 for search: '"Amparo Varona"'
Published in:
Applied Sciences, Vol 14, Iss 5, p 1951 (2024)
The development of speech technology requires large amounts of data to estimate the underlying models. Even when relying on large multilingual pre-trained models, some amount of task-specific data on the target language is needed to fine-tune those m
External link:
https://doaj.org/article/b728311dc6f649c7b17ba4a86578c55f
Authors:
Eduardo Lleida, Luis Javier Rodriguez-Fuentes, Javier Tejedor, Alfonso Ortega, Antonio Miguel, Virginia Bazán, Carmen Pérez, Alberto de Prada, Mikel Penagarikano, Amparo Varona, Germán Bordel, Doroteo Torre-Toledano, Aitor Álvarez, Haritz Arzelus
Published in:
Applied Sciences, Vol 13, Iss 15, p 8577 (2023)
Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part o
External link:
https://doaj.org/article/75d032e326114ebca601ea06a1a3cd04
Published in:
Applied Sciences, Vol 13, Iss 14, p 8492 (2023)
In this paper, a semisupervised speech data extraction method is presented and applied to create a new dataset designed for the development of fully bilingual Automatic Speech Recognition (ASR) systems for Basque and Spanish. The dataset is drawn fro
External link:
https://doaj.org/article/e8c099db1d79434f9d1b16b7d1d9094f
Published in:
IberSPEECH 2022.
Published in:
IberSPEECH 2022.
Published in:
IberSPEECH
Authors:
Luis Javier Rodríguez-Fuentes, Amparo Varona, Aitor Alvarez, Germán Bordel, Mikel Peñagarikano
Published in:
IEEE Signal Processing Letters. 23:126-129
The synchronization of text transcripts with audio tracks is typically solved by forced alignment at the phonetic level. However, when dealing with either very long audio tracks or acoustically inaccurate text transcripts, more complex methods are ne
Published in:
Language Resources and Evaluation. 50:221-243
KALAKA-3 is a speech database specifically designed for the development and evaluation of Spoken Language Recognition (SLR) systems. The database provides TV broadcast speech for training, and audio data extracted from YouTube videos for tuning and t
Published in:
IEEE Signal Processing Letters. 21:1073-1077
The so-called Phone Log-Likelihood Ratio (PLLR) features have been recently introduced as a novel and effective way of retrieving acoustic-phonetic information in spoken language and speaker recognition systems. In this letter, an in-depth insight in
Published in:
Procedia - Social and Behavioral Sciences. 141:961-968
In the European higher education landscape, Dublin descriptors indicate that qualifications are awarded to students who have demonstrated knowledge and understanding in a field and can apply it in a “professional” way. In this context, “profess