Showing 1 - 10 of 425 for search: '"Sadeghi Mostafa"'
Distant-microphone meeting transcription is a challenging task. State-of-the-art end-to-end speaker-attributed automatic speech recognition (SA-ASR) architectures lack a multichannel noise and reverberation reduction front-end, which limits their per
External link:
http://arxiv.org/abs/2410.21849
This paper proposes a new unsupervised audiovisual speech enhancement (AVSE) approach that combines a diffusion-based audio-visual speech generative model with a non-negative matrix factorization (NMF) noise model. First, the diffusion model is pre-t
External link:
http://arxiv.org/abs/2410.05301
Published in:
پژوهشهای سلامتمحور, Vol 6, Iss 1, Pp 39-49 (2020)
The Relationship between Health Literacy and Organizational Culture among the Staff of Rafsanjan University of Medical Sciences. Sadeghi Mostafa (1), Shakiba Elham (2)*, Naderi Monavar (3). (1) Professor, Department of Restorative Dentistry, Faculty of Dentist
External link:
https://doaj.org/article/e91ea606a61d4e92be8089f9968004a4
Published in:
The Speaker and Language Recognition Workshop Odyssey 2024, Jun 2024, Quebec, Canada
Past studies on end-to-end meeting transcription have focused on model architecture and have mostly been evaluated on simulated meeting data. We present a novel study aiming to optimize the use of a Speaker-Attributed ASR (SA-ASR) system in real-life
External link:
http://arxiv.org/abs/2403.06570
Author:
Leglaive, Simon, Fraticelli, Matthieu, ElGhazaly, Hend, Borne, Léonie, Sadeghi, Mostafa, Wisdom, Scott, Pariente, Manuel, Hershey, John R., Pressnitzer, Daniel, Barker, Jon P.
Supervised models for speech enhancement are trained using artificially generated mixtures of clean speech and noise signals. However, the synthetic training conditions may not accurately reflect real-world conditions encountered during testing. This
External link:
http://arxiv.org/abs/2402.01413
Joint punctuated and normalized automatic speech recognition (ASR), which outputs transcripts with and without punctuation and casing, remains challenging due to the lack of paired speech and punctuated text data in most ASR corpora. We propose two ap
External link:
http://arxiv.org/abs/2311.17741
We present an end-to-end multichannel speaker-attributed automatic speech recognition (MC-SA-ASR) system that combines a Conformer-based encoder with multi-frame cross-channel attention and a speaker-attributed Transformer-based decoder. To the best o
External link:
http://arxiv.org/abs/2310.10106
Author:
Sadeghi, Mostafa, Serizel, Romain
In this paper, we address the unsupervised speech enhancement problem based on recurrent variational autoencoder (RVAE). This approach offers promising generalization performance over the supervised counterpart. Nevertheless, the involved iterative v
External link:
http://arxiv.org/abs/2309.10439
Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods. These models transform clean speech training samples into Gaussian noise centered at noisy speec
External link:
http://arxiv.org/abs/2309.10457
Recently, conditional score-based diffusion models have gained significant attention in the field of supervised speech enhancement, yielding state-of-the-art performance. However, these methods may face challenges when generalising to unseen conditio
External link:
http://arxiv.org/abs/2309.10450