Search results - "Miquel India"

Academic article

Self Attention Networks in Speaker Recognition

Authors: Pooyan Safari, Miquel India, Javier Hernando

Published in: Applied Sciences, Vol 13, Iss 11, p 6410 (2023)

Recently, there has been a significant surge of interest in Self-Attention Networks (SANs) based on the Transformer architecture. This can be attributed to their notable ability for parallelization and their impressive performance across various Natu

External link: https://doaj.org/article/d91434e596a449c4a466d9b2589b7f0e

Speaker characterization by means of attention pooling

Authors: Federico Costa, Miquel India, Javier Hernando

State-of-the-art Deep Learning systems for speaker verification are commonly based on speaker embedding extractors. These architectures are usually composed of a feature extractor front-end together with a pooling layer to encode variable length utte

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5b8b201cf2a658d62d21ca37362b74a1
https://hdl.handle.net/2117/384802

Language modelling for speaker diarization in telephonic interviews

Authors: Miquel India, Javier Hernando, José A.R. Fonollosa

Published in: Computer Speech & Language. 78:101441

The aim of this paper is to investigate the benefit of combining both language and acoustic modelling for speaker diarization. Although conventional systems only use acoustic features, in some scenarios linguistic data contain high discriminative spe

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7ba8b47f9fc9f61c4919b3e6fd552a35
https://doi.org/10.1016/j.csl.2022.101441

Double Multi-Head Attention for Speaker Verification

Authors: Miquel India, Javier Hernando, Pooyan Safari

Published in: UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
ICASSP

Most state-of-the-art Deep Learning systems for text-independent speaker verification are based on speaker embedding extractors. These architectures are commonly composed of a feature extractor front-end together with a pooling layer to encode variab

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3e1c759f0c2f9616c687d61dd14c7dd4
http://arxiv.org/abs/2007.13199

I-Vector Transformation Using K-Nearest Neighbors for Speaker Verification

Authors: Umair Khan, Javier Hernando, Miquel India

Published in: ICASSP
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)

Probabilistic Linear Discriminant Analysis (PLDA) is the most efficient backend for i-vectors. However, it requires labeled background data which can be difficult to access in practice. Unlike PLDA, cosine scoring avoids speaker-labels at the cost of

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f9a3af2397d3c20d9e6b5d368db72987
https://doi.org/10.1109/icassp40776.2020.9053504

Self-attention encoding and pooling for speaker recognition

Authors: Pooyan Safari, Javier Hernando, Miquel India

Published in: UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
INTERSPEECH

The computing power of mobile devices limits the end-user applications in terms of storage size, processing, memory and energy consumption. These limitations motivate researchers for the design of more efficient deep models. On the other hand, self-a

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1b0d84dddab9d30671ab1730e5797671

Auto-Encoding Nearest Neighbor i-Vectors for Speaker Verification

Authors: Javier Hernando, Umair Khan, Miquel India

Published in: INTERSPEECH
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)

In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap between the two scoring techniq

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ae202e4f2d023a6e64db06a60a8974e7
https://doi.org/10.21437/interspeech.2019-1444

Self multi-head attention for speaker recognition

Authors: Javier Hernando, Miquel India, Pooyan Safari

Published in: INTERSPEECH
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)

Most state-of-the-art Deep Learning (DL) approaches for speaker recognition work on a short utterance level. Given the speech signal, these algorithms extract a sequence of speaker embeddings from short segments and those are averaged to obtain an ut

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bff89a7fa3b71b88b960a1454954a90f
https://hdl.handle.net/2117/178623

LSTM Neural Network-Based Speaker Segmentation Using Acoustic and Language Modelling

Authors: Javier Hernando, Miquel India, José A. R. Fonollosa

Published in: INTERSPEECH

This paper presents a new speaker change detection system based on Long Short-Term Memory (LSTM) neural networks using acoustic data and linguistic content. Language modelling is combined with two different Joint Factor Analysis (JFA) acoustic approa

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::f432959e9fe5d84c8cd39b0aed30c5ec
https://doi.org/10.21437/interspeech.2017-407

Towards large scale multimedia indexing: a case study on person discovery in broadcast news

Authors: Jean-Marc Odobez, Guillaume Gravier, Carmen García-Mateo, Izabela Lyon Freire, Paula Lopez-Otero, Silvio Jamil Ferzoli Guimarães, Hervé Bredin, Claude Barras, Miquel India, Camille Guinaudeau, Gabriel Sargent, Zenilton Kleber Gonçalves do Patrocínio, Nam Le, Gerard Martí, Josep Ramon Morros, Laura Docio-Fernandez, Gabriel Barbosa da Fonseca, Javier Hernando, Sylvain Meignier

Published in: Recercat. Dipósit de la Recerca de Catalunya
Content-Based Multimedia Indexing CBMI, Jun 2017, Firenze, Italy. ⟨10.1145/3095713.3095732⟩
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)

Paper presented at the 15th International Workshop on Content-Based Multimedia Indexing (CBMI'17), held in Florence, Italy, 19-21 June 2017. The rapid growth of multimedia databases and the human interest in their peers mak

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9bb8589dcb99a98a7a0cf1e20dd353b2
http://hdl.handle.net/2117/112283
