Lightweight Speaker Recognition in Poincaré Spaces

Authors: Seokhyeong Kang, Kim Sung-Bin, Tae-Hyun Oh, Jieun Lee
Publication year: 2022
Subject:
Source: IEEE Signal Processing Letters. 29:224-228
ISSN: 1558-2361
1070-9908
DOI: 10.1109/lsp.2021.3129695
Description: This letter proposes a lightweight model for speaker recognition that leverages hyperbolic space. Speaker recognition performance depends heavily on the distinctiveness of speaker embeddings induced by metric learning. However, most state-of-the-art embedding methods are based on the Euclidean metric space, which does not account for the inherent hierarchical structure of voice characteristics. Recent developments in neural hyperbolic geometry have demonstrated its effectiveness in modeling continuous hierarchical structures, which have typically been cumbersome to model with standard deep neural networks. This property yields a compact representation as a by-product. Inspired by the favorable properties of hyperbolic geometry, we developed a hyperbolic ResNet for speaker recognition. We found that, in lower-dimensional regimes than are typical, the learned speaker embeddings are more discriminative; in other words, more compact at the same level of performance. Our experiments on the large-scale VoxCeleb datasets show that, given limited channel dimensions, our method consistently performs favorably against the standard ResNet on both speaker recognition and verification tasks.
Database: OpenAIRE
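
Note: The abstract contrasts Euclidean embeddings with embeddings in a hyperbolic (Poincaré) space. As a rough illustration of what changes, the sketch below computes the standard geodesic distance on the unit Poincaré ball with curvature -1. It is not the authors' implementation: the record does not describe the hyperbolic ResNet itself, and the function name `poincare_distance` and the sample points are assumptions made for this example.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Uses the standard closed form for curvature -1:
        d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_dist / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Two pairs of points with the same Euclidean gap (0.1) along one axis:
# one pair near the origin, one pair near the boundary of the ball.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))
```

In this sketch the pair near the boundary is much farther apart than the pair near the origin, even though both pairs are equally far apart in the Euclidean sense. This growth of distance toward the boundary is the property usually credited with letting hyperbolic spaces represent hierarchical structure in few dimensions.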