Showing 1 - 10 of 715 for search: '"P, Rouvier"'
Published in:
Odyssey 2024, Jun 2024, Quebec, Canada
In this work, we detail our submission to the 2024 edition of the MSP-Podcast Speech Emotion Recognition (SER) Challenge. This challenge is divided into two distinct tasks: Categorical Emotion Recognition and Emotional Attribute Prediction. We concen
External link:
http://arxiv.org/abs/2407.05746
Author:
Ravanelli, Mirco, Parcollet, Titouan, Moumen, Adel, de Langen, Sylvain, Subakan, Cem, Plantinga, Peter, Wang, Yingzhi, Mousavi, Pooneh, Della Libera, Luca, Ploujnikov, Artem, Paissan, Francesco, Borra, Davide, Zaiem, Salah, Zhao, Zeyu, Zhang, Shucong, Karakasidis, Georgios, Yeh, Sung-Lin, Champion, Pierre, Rouhe, Aku, Braun, Rudolf, Mai, Florian, Zuluaga-Gomez, Juan, Mousavi, Seyed Mahed, Nautsch, Andreas, Nguyen, Ha, Liu, Xuechen, Sagar, Sangeet, Duret, Jarod, Mdhaffar, Salima, Laperriere, Gaelle, Rouvier, Mickael, De Mori, Renato, Esteve, Yannick
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and
External link:
http://arxiv.org/abs/2407.00463
Published in:
InterSpeech 2024
In the rapidly evolving landscape of spoken question-answering (SQA), the integration of large language models (LLMs) has emerged as a transformative development. Conventional approaches often entail the use of separate models for question audio tran
External link:
http://arxiv.org/abs/2406.05876
The SdSv challenge Task 2 provided an opportunity to assess efficiency and robustness of modern text-independent speaker verification systems. But it also made it possible to test new approaches, capable of taking into account the main issues of this
External link:
http://arxiv.org/abs/2403.19634
Deep learning architectures have made significant progress in terms of performance in many research areas. The automatic speech recognition (ASR) field has thus benefited from these scientific and technological advances, particularly for acoustic mod
External link:
http://arxiv.org/abs/2402.19443
Published in:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language models. This shift began with Byte-Pair Encoding (BPE) an
External link:
http://arxiv.org/abs/2402.15010
Author:
Labrak, Yanis, Bazoge, Adrien, Khettari, Oumaima El, Rouvier, Mickael, Beaufils, Pacome Constant dit, Grabar, Natalia, Daille, Beatrice, Quiniou, Solen, Morin, Emmanuel, Gourraud, Pierre-Antoine, Dufour, Richard
Published in:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The biomedical domain has sparked a significant interest in the field of Natural Language Processing (NLP), which has seen substantial advancements with pre-trained language models (PLMs). However, comparing these models has proven challenging due to
External link:
http://arxiv.org/abs/2402.13432
Author:
Labrak, Yanis, Bazoge, Adrien, Morin, Emmanuel, Gourraud, Pierre-Antoine, Rouvier, Mickael, Dufour, Richard
Published in:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics - Volume 1: Long Papers (ACL 2024)
Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, offering potential applications across specialized domains such as healthcare and medicine. Despite the availability of various open-source LLMs tailored for healt
External link:
http://arxiv.org/abs/2402.10373
A new loss function for speaker recognition with deep neural networks is proposed, based on Jeffreys Divergence. Adding this divergence to the cross-entropy loss function makes it possible to maximize the target value of the output distribution while smoothing t
External link:
http://arxiv.org/abs/2312.16885
Author:
Miao, Xiaoxiao, Wang, Xin, Cooper, Erica, Yamagishi, Junichi, Evans, Nicholas, Todisco, Massimiliano, Bonastre, Jean-François, Rouvier, Mickael
The success of deep learning in speaker recognition relies heavily on the use of large datasets. However, the data-hungry nature of deep learning methods has already been questioned on account of the ethical, privacy, and legal concerns that arise when
External link:
http://arxiv.org/abs/2309.06141