Showing 1 - 10 of 173 for search: '"Eduardo Lleida"'
Author:
Dayana Ribas, Miguel A. Pastor, Antonio Miguel, David Martinez, Alfonso Ortega, Eduardo Lleida
Published in:
IEEE Access, Vol 11, Pp 14915-14927 (2023)
Many speech features and models, including Deep Neural Networks (DNN), are used for classification tasks between healthy and pathological speech with the Saarbruecken Voice Database (SVD). However, accuracy values of 80.71% for phrases or 82.8% for v
External link:
https://doaj.org/article/53616cb4fc3a49809ecae4af0f0a370a
Published in:
Applied Sciences, Vol 13, Iss 16, p 9062 (2023)
Speech Emotion Recognition (SER) plays a crucial role in applications involving human-machine interaction. However, the scarcity of suitable emotional speech datasets presents a major challenge for accurate SER systems. Deep Neural Network (DNN)-base
External link:
https://doaj.org/article/cb6bcbf6b1204763841ed65533408409
Author:
Eduardo Lleida, Luis Javier Rodriguez-Fuentes, Javier Tejedor, Alfonso Ortega, Antonio Miguel, Virginia Bazán, Carmen Pérez, Alberto de Prada, Mikel Penagarikano, Amparo Varona, Germán Bordel, Doroteo Torre-Toledano, Aitor Álvarez, Haritz Arzelus
Published in:
Applied Sciences, Vol 13, Iss 15, p 8577 (2023)
Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part o
External link:
https://doaj.org/article/75d032e326114ebca601ea06a1a3cd04
Published in:
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2021, Iss 1, Pp 1-16 (2021)
The progressive paradigm is a promising strategy to optimize network performance for speech enhancement purposes. Recent works have shown different strategies to improve the accuracy of speech enhancement solutions based on this mechanism. T
External link:
https://doaj.org/article/9f566a13be174492bb726d8cc73bb7ca
Published in:
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2020, Iss 1, Pp 1-19 (2020)
This paper presents a new approach based on recurrent neural networks (RNN) to the multiclass audio segmentation task whose goal is to classify an audio signal as speech, music, noise or a combination of these. The proposed system is based o
External link:
https://doaj.org/article/ca6492b87b674fc983c5b19bc6e94cd8
Published in:
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2019, Iss 1, Pp 1-13 (2019)
We present a novel model adaptation approach to deal with data variability for speaker diarization in a broadcast environment. Expensive human annotated data can be used to mitigate the domain mismatch by means of supervised model adaptation
External link:
https://doaj.org/article/ba31c624d7fb4134ace03b2bac63050e
Published in:
Applied Sciences, Vol 12, Iss 18, p 9000 (2022)
This paper proposes a Deep Learning (DL) based Wiener filter estimator for speech enhancement in the framework of the classical spectral-domain speech estimator algorithm. According to the characteristics of the intermediate steps of the speech enhan
External link:
https://doaj.org/article/1c1f63f5c7dd47028e08c58474cfb522
Published in:
Applied Sciences, Vol 12, Iss 4, p 1832 (2022)
Speech Activity Detection (SAD) aims to accurately classify audio fragments containing human speech. Current state-of-the-art systems for the SAD task are mainly based on deep learning solutions. These applications usually show a significant drop in
External link:
https://doaj.org/article/9f01d82a82ac4dfe9e17d99239ed91e7
Author:
Victoria Mingote, Ignacio Viñals, Pablo Gimeno, Antonio Miguel, Alfonso Ortega, Eduardo Lleida
Published in:
Applied Sciences, Vol 12, Iss 3, p 1141 (2022)
This paper describes a post-evaluation analysis of the system developed by ViVoLAB research group for the IberSPEECH-RTVE 2020 Multimodal Diarization (MD) Challenge. This challenge focuses on the study of multimodal systems for the diarization of aud
External link:
https://doaj.org/article/de4ff4fbaa634490a1ee4bcc1eeec40c
Published in:
Applied Sciences, Vol 11, Iss 18, p 8521 (2021)
The demand of high-quality metadata for the available multimedia content requires the development of new techniques able to correctly identify more and more information, including the speaker information. The task known as speaker attribution aims at
External link:
https://doaj.org/article/932eaf5cbc24436d8227da05fa3a5bfb