Showing 1 - 2 of 2 for search: '"Katta, Sandesh V"'
We introduce DECAR, a self-supervised pre-training approach for learning general-purpose audio representations. Our system is based on clustering: it utilizes an offline clustering step to provide target labels that act as pseudo-labels for solving a
External link:
http://arxiv.org/abs/2110.08895
One of the most popular speaker embeddings is x-vectors, which are obtained from an architecture that gradually builds a larger temporal context with layers. In this paper, we propose to derive speaker embeddings from Transformer's encoder trained fo
External link:
http://arxiv.org/abs/2008.04659