Showing 1 - 10 of 78 for the search: '"Badino, Leonardo"'
Published in:
Proceedings of Interspeech 2024, pp. 3652--3653
We present a modular toolkit to perform joint speaker diarization and speaker identification. The toolkit can leverage multiple models and algorithms, which are defined in a configuration file. Such flexibility allows our system to work properly in …
External link:
http://arxiv.org/abs/2409.05750
This paper addresses spoken language identification (SLI) and speech recognition of multilingual broadcast and institutional speech, real application scenarios that have been rarely addressed in the SLI literature. Observing that in these domains lan…
External link:
http://arxiv.org/abs/2406.09290
Author:
Turrisi, Rosanna, Badino, Leonardo
Published in:
Proc. Interspeech 2022
This work addresses the mismatch problem between the distribution of training data (source) and testing data (target), in the challenging context of dysarthric speech recognition. We focus on Speaker Adaptation (SA) in command speech recognition, whe…
External link:
http://arxiv.org/abs/2203.07143
Author:
Turrisi, Rosanna, Badino, Leonardo
In many real-world applications, the mismatch between distributions of training data (source) and test data (target) significantly degrades the performance of machine learning algorithms. In speech data, causes of this mismatch include different acou…
External link:
http://arxiv.org/abs/2104.02535
Author:
Turrisi, Rosanna, Braccia, Arianna, Emanuele, Marco, Giulietti, Simone, Pugliatti, Maura, Sensi, Mariachiara, Fadiga, Luciano, Badino, Leonardo
Published in:
Interspeech 2021
This paper introduces a new dysarthric speech command dataset in Italian, called the EasyCall corpus. The dataset consists of 21,386 audio recordings from 24 healthy and 31 dysarthric speakers, whose individual degree of speech impairment was assessed by …
External link:
http://arxiv.org/abs/2104.02542
We propose a method to address audio-visual target speaker enhancement in multi-talker environments using event-driven cameras. State-of-the-art audio-visual speech separation methods show that crucial information is the movement of the facial landm…
External link:
http://arxiv.org/abs/1912.02671
In this paper, we analyzed how audio-visual speech enhancement can help to perform the ASR task in a cocktail party scenario. To this end, we considered two simple end-to-end LSTM-based models that perform single-channel audio-visual speech enhancement…
External link:
http://arxiv.org/abs/1904.08248
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
Author:
Morrone, Giovanni, Pasa, Luca, Tikhanoff, Vadim, Bergamaschi, Sonia, Fadiga, Luciano, Badino, Leonardo
In this paper, we address the problem of enhancing the speech of a speaker of interest in a cocktail party scenario when visual information of the speaker of interest is available. Contrary to most previous studies, we do not learn visual features on …
External link:
http://arxiv.org/abs/1811.02480
Published in:
2018 IEEE Spoken Language Technology Workshop (SLT)
We address the problem of reconstructing articulatory movements, given audio and/or phonetic labels. The scarce availability of multi-speaker articulatory data makes it difficult to learn a reconstruction that generalizes to new speakers and across d…
External link:
http://arxiv.org/abs/1809.00938
Author:
Badino, Leonardo
This thesis proposes to improve and enrich the expressiveness of English Text-to-Speech (TTS) synthesis by identifying and generating natural patterns of prosodic prominence. In most state-of-the-art TTS systems the prediction from text of prosodic p…
External link:
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.563044