Showing 1 - 10 of 10 results for search: '"Miquel India"'
Published in:
Applied Sciences, Vol 13, Iss 11, p 6410 (2023)
Recently, there has been a significant surge of interest in Self-Attention Networks (SANs) based on the Transformer architecture. This can be attributed to their notable ability for parallelization and their impressive performance across various Natu
External link:
https://doaj.org/article/d91434e596a449c4a466d9b2589b7f0e
State-of-the-art Deep Learning systems for speaker verification are commonly based on speaker embedding extractors. These architectures are usually composed of a feature extractor front-end together with a pooling layer to encode variable length utte
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b8b201cf2a658d62d21ca37362b74a1
https://hdl.handle.net/2117/384802
Published in:
Computer Speech & Language. 78:101441
The aim of this paper is to investigate the benefit of combining both language and acoustic modelling for speaker diarization. Although conventional systems only use acoustic features, in some scenarios linguistic data contain high discriminative spe
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
ICASSP
Most state-of-the-art Deep Learning systems for text-independent speaker verification are based on speaker embedding extractors. These architectures are commonly composed of a feature extractor front-end together with a pooling layer to encode variab
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3e1c759f0c2f9616c687d61dd14c7dd4
http://arxiv.org/abs/2007.13199
Published in:
ICASSP
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Probabilistic Linear Discriminant Analysis (PLDA) is the most efficient backend for i-vectors. However, it requires labeled background data which can be difficult to access in practice. Unlike PLDA, cosine scoring avoids speaker-labels at the cost of
Published in:
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
INTERSPEECH
The computing power of mobile devices limits the end-user applications in terms of storage size, processing, memory and energy consumption. These limitations motivate researchers for the design of more efficient deep models. On the other hand, self-a
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1b0d84dddab9d30671ab1730e5797671
Published in:
INTERSPEECH
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap between the two scoring techniq
Published in:
INTERSPEECH
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Most state-of-the-art Deep Learning (DL) approaches for speaker recognition work on a short utterance level. Given the speech signal, these algorithms extract a sequence of speaker embeddings from short segments and those are averaged to obtain an ut
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bff89a7fa3b71b88b960a1454954a90f
https://hdl.handle.net/2117/178623
Published in:
INTERSPEECH
This paper presents a new speaker change detection system based on Long Short-Term Memory (LSTM) neural networks using acoustic data and linguistic content. Language modelling is combined with two different Joint Factor Analysis (JFA) acoustic approa
Authors:
Jean-Marc Odobez, Guillaume Gravier, Carmen García-Mateo, Izabela Lyon Freire, Paula Lopez-Otero, Silvio Jamil Ferzoli Guimarães, Hervé Bredin, Claude Barras, Miquel India, Camille Guinaudeau, Gabriel Sargent, Zenilton Kleber Gonçalves do Patrocínio, Nam Le, Gerard Martí, Josep Ramon Morros, Laura Docio-Fernandez, Gabriel Barbosa da Fonseca, Javier Hernando, Sylvain Meignier
Published in:
Recercat. Dipósit de la Recerca de Catalunya
Content-Based Multimedia Indexing CBMI
Content-Based Multimedia Indexing CBMI, Jun 2017, Firenze, Italy. ⟨10.1145/3095713.3095732⟩
CBMI
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Paper presented at the 15th International Workshop on Content-Based Multimedia Indexing (CBMI'17), held in Florence, Italy, 19-21 June 2017. The rapid growth of multimedia databases and the human interest in their peers mak
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9bb8589dcb99a98a7a0cf1e20dd353b2
http://hdl.handle.net/2117/112283