Showing 1 - 10 of 39 for search: '"Mickael Rouvier"'
Published in:
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2010 (2010)
Spoken utterance retrieval has been widely studied in recent decades, with the purpose of indexing large audio databases or of detecting keywords in continuous speech streams. While the indexing of closed corpora can be performed via a batch process, o
External link:
https://doaj.org/article/a8f21253182c4ac88471e7199e30dcf7
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Authors:
Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
Published in:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL'23), Jul 2023, Toronto (CA), Canada
In recent years, pre-trained language models (PLMs) have achieved the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general domain data, specialized ones have emerged to more effectively
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::75f51159d229dd720adbf19716c849e0
Published in:
Proceedings Interspeech 2022
Interspeech, Sep 2022, Incheon, South Korea
Evaluating automatic speech recognition (ASR) systems is a classical but difficult and still open problem, which often boils down to focusing only on the word error rate (WER). However, this metric suffers from many limitation
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2fc81f48128ac9bf43290134f128d41f
https://hal.science/hal-03712735v2/document
Published in:
Interspeech 2022.
In this paper we examine the use of semantically-aligned speech representations for end-to-end spoken language understanding (SLU). We employ the recently-introduced SAMU-XLSR model, which is designed to generate a single embedding that captures the
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f37adea6ef4b140272ab43dde364388c
Published in:
23rd International Conference on Speech and Computer (SPECOM), Sep 2021, Saint Petersburg, Russia
Speech and Computer ISBN: 9783030878016
Finding professional voice-actors for cultural productions is performed by a human operator and suffers from several difficulties. Researchers have therefore been interested for several years in mimicking the process of vocal
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0fc7a917299639abefa9fb7060f779dc
https://hal.archives-ouvertes.fr/hal-03348578
The x-vector architecture has recently achieved state-of-the-art results on the speaker verification task. This architecture incorporates a central layer, referred to as temporal pooling, which stacks statistical parameters of the acoustic frame dist
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::45b29e53bf34cabea0513d9db2f38248
http://arxiv.org/abs/2105.04310
Published in:
EUSIPCO
Recently, the x-vector framework, extracted with deep neural network architectures, became the state-of-the-art method for speaker verification. Although another level of performance has been reached with this approach, fine-tuning and optimizing th
Published in:
Speech and Computer ISBN: 9783030878016
SPECOM
In this article we propose to study several approaches to adapt a system between two languages. To train the state of the art x-vector Speaker Verification system, we need a huge amount of labeled speech data. If this constraint is satisfied in Engli
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::e0a7d4d1bb32a29febf90f3b957508be
https://doi.org/10.1007/978-3-030-87802-3_9