Showing 1 - 10 of 7,962 results for search: '"discriminative training"'
Author:
Chow, Wei, Li, Juncheng, Yu, Qifan, Pan, Kaihang, Fei, Hao, Ge, Zhiqi, Yang, Shuai, Tang, Siliang, Zhang, Hanwang, Sun, Qianru
In recent times, Vision-Language Models (VLMs) have been trained under two predominant paradigms. Generative training has enabled Multimodal Large Language Models (MLLMs) to tackle various complex tasks, yet issues such as hallucinations and weak obj…
External link:
http://arxiv.org/abs/2411.00304
This paper introduces a novel training framework called Focused Discriminative Training (FDT) to further improve streaming word-piece end-to-end (E2E) automatic speech recognition (ASR) models trained using either CTC or an interpolation of CTC and a…
External link:
http://arxiv.org/abs/2408.13008
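As background to the entry above, which concerns E2E ASR models trained with the CTC objective, here is a minimal pure-Python sketch of the standard CTC forward algorithm (the per-utterance loss that discriminative criteria like FDT build on). The function name, interface, and toy inputs are illustrative assumptions, not taken from the cited paper:

```python
import math

def ctc_loss(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC.

    log_probs: list of per-frame lists of log-probabilities over the vocabulary.
    target:    list of label ids (no blanks), assumed non-empty.
    """
    # Extended label sequence with blanks interleaved: b, l1, b, l2, ..., b
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(log_probs)
    NEG_INF = float("-inf")

    def logsumexp(*xs):
        m = max(xs)
        if m == NEG_INF:
            return NEG_INF
        return m + math.log(sum(math.exp(x - m) for x in xs))

    # alpha[s]: log-prob of emitting ext[:s+1] with the frames seen so far
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            candidates = [alpha[s]]
            if s > 0:
                candidates.append(alpha[s - 1])
            # May skip the intervening blank when consecutive labels differ
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            new[s] = logsumexp(*candidates) + log_probs[t][ext[s]]
        alpha = new
    # Valid paths end on the last label or the final blank
    return -logsumexp(alpha[S - 1], alpha[S - 2]) if S > 1 else -alpha[0]
```

With two frames of uniform log-probabilities over `{blank, 1}` and target `[1]`, the valid alignments are `(1,1)`, `(blank,1)`, and `(1,blank)`, so the loss is `-log(0.75)`.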
One-shot voice conversion (VC) aims to change the timbre of any source speech to match that of the target speaker with only one speech sample. Existing style transfer-based VC methods relied on speech representation disentanglement and suffered from a…
External link:
http://arxiv.org/abs/2409.01668
In this work, we investigate the effect of language models (LMs) with different context lengths and label units (phoneme vs. word) used in sequence discriminative training for phoneme-based neural transducers. Both lattice-free and N-best-list approa…
External link:
http://arxiv.org/abs/2310.07345
Author:
Klement, Dominik, Diez, Mireia, Landini, Federico, Burget, Lukáš, Silnova, Anna, Delcroix, Marc, Tawara, Naohiro
Bayesian HMM clustering of x-vector sequences (VBx) has become a widely adopted diarization baseline model in publications and challenges. It uses an HMM to model speaker turns, a generatively trained probabilistic linear discriminant analysis (PLDA)…
External link:
http://arxiv.org/abs/2310.02732
Internal language model (ILM) subtraction has been widely applied to improve the performance of the RNN-Transducer with external language model (LM) fusion for speech recognition. In this work, we show that sequence discriminative training has a stro…
External link:
http://arxiv.org/abs/2309.14130
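The entry above refers to ILM subtraction during external-LM fusion. A common log-linear form of the combined hypothesis score is sketched below; the function name, default weights, and example values are illustrative assumptions, not taken from the cited paper:

```python
def fused_score(asr_logp, ext_lm_logp, ilm_logp, lam=0.6, mu=0.4):
    """Shallow-fusion hypothesis score with ILM subtraction:

        score = log P_ASR(y|x) + lam * log P_extLM(y) - mu * log P_ILM(y)

    Subtracting the internal LM score counteracts the ASR model's own
    implicit language prior before adding the external LM. The weights
    lam and mu are tuned on held-out data; the defaults are placeholders.
    """
    return asr_logp + lam * ext_lm_logp - mu * ilm_logp
```

For example, `fused_score(-1.0, -2.0, -3.0)` combines the three log-probabilities as `-1.0 + 0.6 * (-2.0) - 0.4 * (-3.0) = -1.0`.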
Author:
Ozbulak, Utku, Lee, Hyun Jung, Boga, Beril, Anzaku, Esla Timothy, Park, Homin, Van Messem, Arnout, De Neve, Wesley, Vankerschaver, Joris
Published in:
Transactions on Machine Learning Research, 2023
Although supervised learning has been highly successful in improving the state-of-the-art in the domain of image-based computer vision in the past, the margin of improvement has diminished significantly in recent years, indicating that a plateau is i…
External link:
http://arxiv.org/abs/2305.13689
Author:
Sansone, Emanuele, Manhaeve, Robin
Published in:
ICLR 2023 Workshop NeSy-GeMs
We introduce GEDI, a Bayesian framework that combines existing self-supervised learning objectives with likelihood-based generative models. This framework leverages the benefits of both GEnerative and DIscriminative approaches, resulting in improved…
External link:
http://arxiv.org/abs/2304.11357
Author:
Li Xue
Published in:
Applied Mathematics and Nonlinear Sciences, Vol 8, Iss 2, Pp 193-202 (2023)
Aiming at a higher correlation between the objective evaluation of computer-scored English speech and the subjective evaluation of experts, an acoustic model based on discriminative training is proposed to improve the confidence score of objective evaluat…
External link:
https://doaj.org/article/299123e873db459eb99094298fedd906
Author:
Sansone, Emanuele, Manhaeve, Robin
Self-supervised learning is a popular and powerful method for utilizing large amounts of unlabeled data, for which a wide variety of training objectives have been proposed in the literature. In this study, we perform a Bayesian analysis of state-of-t…
External link:
http://arxiv.org/abs/2212.13425