Showing 1 - 10 of 60 for search: '"Plchot, Oldrich"'
Author:
Barahona, Sara, Mošner, Ladislav, Stafylakis, Themos, Plchot, Oldřich, Peng, Junyi, Burget, Lukáš, Černocký, Jan
In this paper, we refine and validate our method for training speaker embedding extractors using weak annotations. More specifically, we use only the audio stream of the source VoxCeleb videos and the names of the celebrities without knowing the time…
External link:
http://arxiv.org/abs/2410.02364
Author:
Peng, Junyi, Mošner, Ladislav, Zhang, Lin, Plchot, Oldřich, Stafylakis, Themos, Burget, Lukáš, Černocký, Jan
Self-supervised learning (SSL) models for speaker verification (SV) have gained significant attention in recent years. However, existing SSL-based SV systems often struggle to capture local temporal dependencies and generalize across different tasks.
External link:
http://arxiv.org/abs/2409.15234
Author:
Rohdin, Johan, Zhang, Lin, Plchot, Oldřich, Staněk, Vojtěch, Mihola, David, Peng, Junyi, Stafylakis, Themos, Beveraki, Dmitriy, Silnova, Anna, Brukner, Jan, Burget, Lukáš
This paper describes the BUT submitted systems for the ASVspoof 5 challenge, along with analyses. For the conventional deepfake detection task, we use ResNet18 and self-supervised models for the closed and open conditions, respectively. In addition, …
External link:
http://arxiv.org/abs/2408.11152
Speaker embedding extractors are typically trained using a classification loss over the training speakers. During the last few years, the standard softmax/cross-entropy loss has been replaced by margin-based losses, yielding significant improvements…
External link:
http://arxiv.org/abs/2406.12622
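The entry above contrasts plain softmax/cross-entropy training of speaker embedding extractors with margin-based losses. Purely as an illustrative sketch (not the paper's implementation; the margin and scale values below are commonly used defaults assumed here), an additive angular margin (AAM) softmax head in PyTorch could look like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AAMSoftmaxHead(nn.Module):
    """Additive-angular-margin softmax (ArcFace-style) classification head.

    Logits are cosine similarities between L2-normalised embeddings and class
    weights; the target-class angle is penalised by a margin m, and all logits
    are scaled by s before the cross-entropy loss.
    """
    def __init__(self, embed_dim, num_speakers, margin=0.2, scale=30.0):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_speakers, embed_dim))
        nn.init.xavier_normal_(self.weight)
        self.margin = margin
        self.scale = scale

    def forward(self, embeddings, labels):
        # Cosine similarity between normalised embeddings and class weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        # Add the angular margin only to the target-class logit.
        theta = torch.acos(cosine.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        target = F.one_hot(labels, num_classes=cosine.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.margin), cosine)
        return F.cross_entropy(self.scale * logits, labels)
```

Compared with a plain linear layer followed by cross-entropy, the margin forces embeddings of the same speaker to cluster more tightly on the unit hypersphere, which is the kind of improvement the abstract refers to.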
Author:
Peng, Junyi, Delcroix, Marc, Ochiai, Tsubasa, Plchot, Oldrich, Ashihara, Takanori, Araki, Shoko, Cernocky, Jan
Large-scale pre-trained self-supervised learning (SSL) models have shown remarkable advancements in speech-related tasks. However, the utilization of these models in complex multi-talker scenarios, such as extracting a target speaker in a mixture, is…
External link:
http://arxiv.org/abs/2402.13200
Pre-trained self-supervised learning (SSL) models have achieved remarkable success in various speech tasks. However, their potential in target speech extraction (TSE) has not been fully exploited. TSE aims to extract the speech of a target speaker in…
External link:
http://arxiv.org/abs/2402.13199
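Both entries above concern target speech extraction (TSE): isolating one speaker's speech from a mixture, given a cue (for example, an embedding from an enrollment utterance) identifying the target speaker. A toy sketch of the usual conditioning pattern, with hypothetical module names and dimensions that are not taken from these papers:

```python
import torch
import torch.nn as nn

class TinyTSEMaskNet(nn.Module):
    """Toy target-speech-extraction mask estimator.

    The target-speaker embedding is projected and multiplied into the mixture
    encoding (simple multiplicative conditioning); the network then predicts a
    time-frequency mask that is applied to the mixture spectrogram.
    """
    def __init__(self, n_freq=257, spk_dim=192, hidden=256):
        super().__init__()
        self.enc = nn.Linear(n_freq, hidden)
        self.spk_proj = nn.Linear(spk_dim, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.mask = nn.Linear(hidden, n_freq)

    def forward(self, mix_spec, spk_emb):
        # mix_spec: (batch, time, freq) magnitude spectrogram of the mixture
        # spk_emb:  (batch, spk_dim) embedding of the target speaker
        h = torch.relu(self.enc(mix_spec))
        h = h * self.spk_proj(spk_emb).unsqueeze(1)  # condition on the target speaker
        h, _ = self.rnn(h)
        return torch.sigmoid(self.mask(h)) * mix_spec  # extracted spectrogram
```

The papers above study how features from large pre-trained SSL models can feed such an extraction front end; the sketch only illustrates the speaker-conditioning idea itself.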
Author:
Peng, Junyi, Plchot, Oldřich, Stafylakis, Themos, Mošner, Ladislav, Burget, Lukáš, Černocký, Jan
Recently, fine-tuning large pre-trained Transformer models using downstream datasets has received rising interest. Despite their success, it is still challenging to disentangle the benefits of large-scale datasets and Transformer structures from the…
External link:
http://arxiv.org/abs/2305.10517
Author:
Peng, Junyi, Stafylakis, Themos, Gu, Rongzhi, Plchot, Oldřich, Mošner, Ladislav, Burget, Lukáš, Černocký, Jan
Recently, pre-trained Transformer models have received rising interest in the field of speech processing thanks to their great success in various downstream tasks. However, most fine-tuning approaches update all the parameters of the pre-trained…
External link:
http://arxiv.org/abs/2210.16032
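The entry above points out that most fine-tuning updates every parameter of the pre-trained Transformer. A common parameter-efficient alternative in this line of work is to freeze the backbone and train only small inserted modules; the bottleneck adapter below is a generic sketch under assumed dimensions, not the exact design used in the paper:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Residual bottleneck adapter: down-project, non-linearity, up-project.

    Only these few parameters are trained; the surrounding pre-trained
    Transformer weights stay frozen.
    """
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states):
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))
```

Inserted after, say, the feed-forward block of each frozen layer, such adapters (plus a small task head) are the only modules that receive gradients, which keeps the number of task-specific parameters small.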
Author:
Stafylakis, Themos, Mosner, Ladislav, Kakouros, Sofoklis, Plchot, Oldrich, Burget, Lukas, Cernocky, Jan
Self-supervised learning of speech representations from large amounts of unlabeled data has enabled state-of-the-art results in several speech processing tasks. Aggregating these speech representations across time is typically approached by using des…
External link:
http://arxiv.org/abs/2210.09513
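The entry above refers to aggregating frame-level self-supervised representations across time. A minimal example of the plain statistics-pooling baseline (per-dimension mean and standard deviation over the time axis), given here as an assumed illustration rather than the paper's method:

```python
import torch

def statistics_pooling(frames, eps=1e-5):
    """Collapse (batch, time, dim) frame-level features into a fixed-size
    (batch, 2*dim) utterance vector of per-dimension mean and std."""
    mean = frames.mean(dim=1)
    std = frames.var(dim=1, unbiased=False).clamp_min(eps).sqrt()
    return torch.cat([mean, std], dim=1)
```

Attention-based variants weight the frames before computing these statistics; the entry above concerns this kind of aggregation applied to SSL features.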
Author:
Peng, Junyi, Plchot, Oldrich, Stafylakis, Themos, Mosner, Ladislav, Burget, Lukas, Cernocky, Jan
In recent years, the self-supervised learning paradigm has received extensive attention due to its great success in various downstream tasks. However, the fine-tuning strategies for adapting those pre-trained models to the speaker verification task have yet…
External link:
http://arxiv.org/abs/2210.01273