Showing 1 - 10 of 125 for search: '"Demuynck, Kris"'
Spoken term detection (STD) is often hindered by reliance on frame-level features and the computationally intensive DTW-based template matching, limiting its practicality. To address these challenges, we propose a novel approach that encodes speech i
External link:
http://arxiv.org/abs/2411.14100
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization
Author:
Thienpondt, Jenthe; Demuynck, Kris
Current speaker diarization systems rely on an external voice activity detection model prior to speaker embedding extraction on the detected speech segments. In this paper, we establish that the attention system of a speaker embedding extractor acts
External link:
http://arxiv.org/abs/2405.09142
Training monolingual language models for low and mid-resource languages is made challenging by limited and often inadequate pretraining data. In this study, we propose a novel model conversion strategy to address this issue, adapting high-resources m
External link:
http://arxiv.org/abs/2310.03477
In this paper, we analyze the behavior of speaker embeddings of patients during oral cancer treatment. First, we found that pre- and post-treatment speaker embeddings differ significantly, indicating a substantial change in voice characteristics. Howe
External link:
http://arxiv.org/abs/2307.04744
This paper is concerned with the task of speaker verification on audio with multiple overlapping speakers. Most speaker verification systems are designed with the assumption of a single speaker being present in a given audio segment. However, in a re
External link:
http://arxiv.org/abs/2304.03515
Audio fingerprinting systems must efficiently and robustly identify query snippets in an extensive database. To this end, state-of-the-art systems use deep learning to generate compact audio fingerprints. These systems deploy indexing methods, which
External link:
http://arxiv.org/abs/2211.11060
This work introduces BioLORD, a new pre-training strategy for producing meaningful representations for clinical sentences and biomedical concepts. State-of-the-art methodologies operate by maximizing the similarity in representation of names referrin
External link:
http://arxiv.org/abs/2210.11892
An ideal audio retrieval system efficiently and robustly recognizes a short query snippet from an extensive database. However, the performance of well-known audio fingerprinting systems falls short at high signal distortion levels. This paper present
External link:
http://arxiv.org/abs/2210.08624
Author:
Thienpondt, Jenthe; Demuynck, Kris
Automatic Speech Recognition (ASR) systems are known to exhibit difficulties when transcribing children's speech. This can mainly be attributed to the absence of large children's speech corpora to train robust ASR models and the resulting domain mism
External link:
http://arxiv.org/abs/2206.09396
This paper contains a post-challenge performance analysis on cross-lingual speaker verification of the IDLab submission to the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC-21). We show that current speaker embedding extractors consistently und
External link:
http://arxiv.org/abs/2110.09150