Showing 1 - 10 of 1,667 results for the search: '"P. Habets"'
The successful deployment of deep learning-based acoustic echo and noise reduction (AENR) methods in consumer devices has spurred interest in developing low-complexity solutions, while emphasizing the need for robust performance in real-life applications…
External link:
http://arxiv.org/abs/2410.13620
Enhancing speech quality under adverse SNR conditions remains a significant challenge for discriminative deep neural network (DNN)-based approaches. In this work, we propose DisCoGAN, which is a time-frequency-domain generative adversarial network (GAN)… A generic sketch of adversarial training in this setting follows this entry.
External link:
http://arxiv.org/abs/2410.13599
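As a rough, generic illustration of adversarial training for time-frequency-domain speech enhancement (the module architectures, tensor shapes, and loss weights below are assumptions made for the sketch and do not reproduce DisCoGAN itself):

```python
# Generic sketch of one adversarial training step for a time-frequency-domain
# speech-enhancement GAN. Architectures, shapes, and loss weights are
# illustrative assumptions; they do not reproduce DisCoGAN.
import torch
import torch.nn as nn

generator = nn.Sequential(            # noisy magnitude spectrogram -> mask
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
)
discriminator = nn.Sequential(        # scores a (clean or enhanced) spectrogram
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
)
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

noisy = torch.rand(8, 1, 257, 100)    # stand-in noisy magnitude spectrograms
clean = torch.rand(8, 1, 257, 100)    # matching clean targets

# Discriminator update: real = clean, fake = enhanced (detached from generator)
enhanced = generator(noisy) * noisy   # mask-based enhancement
d_loss = bce(discriminator(clean), torch.ones(8, 1)) + \
         bce(discriminator(enhanced.detach()), torch.zeros(8, 1))
d_opt.zero_grad(); d_loss.backward(); d_opt.step()

# Generator update: adversarial term plus an L1 reconstruction term
g_loss = bce(discriminator(enhanced), torch.ones(8, 1)) + \
         nn.functional.l1_loss(enhanced, clean)
g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

The generator here simply predicts a magnitude mask; the actual DisCoGAN generator, discriminator, and loss formulation are described in the linked paper.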
Authors:
Wechsler, Julian; Chetupalli, Srikanth Raj; Halimeh, Mhd Modar; Thiergart, Oliver; Habets, Emanuël A. P.
Capturing audio signals with specific directivity patterns is essential in speech communication. This study presents a deep neural network (DNN)-based approach to directional filtering, alleviating the need for explicit signal models. More specifically… A textbook example of such an explicit directivity model is sketched after this entry.
External link:
http://arxiv.org/abs/2409.13502
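For context on what a "specific directivity pattern" means under an explicit signal model (the conventional approach that the DNN-based method is said to avoid), a textbook first-order differential array forms a cardioid by delaying and subtracting two closely spaced omnidirectional capsules:

```python
# Textbook first-order differential beamformer: delay-and-subtract two omni
# capsules to obtain a cardioid directivity pattern. Purely illustrative of an
# explicit signal model; not the DNN-based method of the paper.
import numpy as np

c = 343.0                                # speed of sound in m/s
d = 0.01                                 # capsule spacing in m
f = 1000.0                               # analysis frequency in Hz
theta = np.linspace(0, 2 * np.pi, 360)   # plane-wave arrival angles

tau = d * np.cos(theta) / c              # inter-capsule delay for each angle
omega = 2 * np.pi * f

# Delay the rear capsule by d/c and subtract -> cardioid null toward the rear
response = 1 - np.exp(-1j * omega * (tau + d / c))
pattern = np.abs(response) / np.max(np.abs(response))

print(pattern[0], pattern[180])          # ~1 toward the front, ~0 toward the back
```

The paper's approach instead learns the directional response with a DNN, avoiding such a closed-form model.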
Authors:
Shetu, Shrishti Saha; Desiraju, Naveen Kumar; Aponte, Jose Miguel Martinez; Habets, Emanuël A. P.; Mabande, Edwin
Deep learning-based methods that jointly perform the task of acoustic echo and noise reduction (AENR) often require high memory and computational resources, making them unsuitable for real-time deployment on low-resource platforms such as embedded devices…
External link:
http://arxiv.org/abs/2408.15746
In this study, we conduct a comparative analysis of deep learning-based noise reduction methods in low signal-to-noise ratio (SNR) scenarios. Our investigation primarily focuses on five key aspects: the impact of training data, the influence of various…
External link:
http://arxiv.org/abs/2408.14582
Dialogue separation involves isolating a dialogue signal from a mixture, such as a movie or a TV program. This can be a necessary step to enable dialogue enhancement for broadcast-related applications. In this paper, ConcateNet for dialogue separation…
External link:
http://arxiv.org/abs/2408.08729
We present a method for blind acoustic parameter estimation from single-channel reverberant speech. The method is structured into three stages. In the first stage, a variational auto-encoder is trained to extract latent representations of acoustic impulse responses… A minimal VAE sketch of such a first stage is shown after this entry.
External link:
http://arxiv.org/abs/2407.19989
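The first stage described above (a variational auto-encoder learning latent representations) could look roughly like the following; the input features, layer sizes, and latent dimension are assumptions made for illustration and are not taken from the paper:

```python
# Minimal VAE sketch for learning latent embeddings from, e.g., log-magnitude
# spectrogram frames. Features, layer sizes, and latent dimension are
# illustrative assumptions, not the configuration used in the paper.
import torch
import torch.nn as nn

class SpecVAE(nn.Module):
    def __init__(self, n_freq=257, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_freq, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)       # posterior mean
        self.to_logvar = nn.Linear(128, latent_dim)   # posterior log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, n_freq)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior
    rec = nn.functional.mse_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

vae = SpecVAE()
frames = torch.rand(16, 257)                # stand-in spectrogram frames
x_hat, mu, logvar = vae(frames)
print(vae_loss(frames, x_hat, mu, logvar))  # mu would serve as the latent embedding
```

In a full pipeline, the posterior mean would typically serve as the embedding handed to the later stages; how the paper maps reverberant speech to these latents is described in the linked preprint.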
Authors:
Lux, Florian; Meyer, Sarina; Behringer, Lyonel; Zalkow, Frank; Do, Phat; Coler, Matt; Habets, Emanuël A. P.; Vu, Ngoc Thang
In this work, we take on the challenging task of building a single text-to-speech synthesis system that is capable of generating speech in over 7000 languages, many of which lack sufficient data for traditional TTS development. By leveraging a novel…
External link:
http://arxiv.org/abs/2406.06403
Authors:
Torcoli, Matteo; Halimeh, Mhd Modar; Leitz, Thomas; Grewe, Yannik; Kratschmer, Michael; Neugebauer, Bernhard; Murtaza, Adrian; Fuchs, Harald; Habets, Emanuël A. P.
The introduction and regulation of loudness in broadcasting and streaming brought clear benefits to the audience, e.g., a level of uniformity across programs and channels. Yet, speech loudness is frequently reported as being too low in certain passages…
External link:
http://arxiv.org/abs/2405.17364
Room geometry inference algorithms rely on the localization of acoustic reflectors to identify boundary surfaces of an enclosure. Rooms with highly absorptive walls or walls at large distances from the measurement setup pose challenges for such algorithms… A toy reflector-localization example is shown after this entry.
External link:
http://arxiv.org/abs/2402.06246
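As a toy illustration of the reflector-localization idea such algorithms build on, the delay between the direct-sound peak and an early-reflection peak of a room impulse response can be converted into a propagation-path difference; the synthetic impulse response, the peak-picking heuristic, and the single-channel setup below are simplifying assumptions, not the paper's method:

```python
# Toy reflector-localization sketch: pick the direct-sound peak and the
# strongest early reflection in a room impulse response, then convert their
# delay difference into an extra propagation distance. Real room geometry
# inference uses multichannel setups and far more robust estimators.
import numpy as np

fs = 16000                 # sampling rate in Hz
c = 343.0                  # speed of sound in m/s

# Synthetic impulse response: direct path at 5 ms, one reflection at 12 ms
rir = np.zeros(fs // 10)
rir[int(0.005 * fs)] = 1.0
rir[int(0.012 * fs)] = 0.4

direct_idx = int(np.argmax(np.abs(rir)))              # direct-sound sample index
later = np.abs(rir).copy()
later[: direct_idx + int(0.001 * fs)] = 0.0           # mask out the direct path
reflection_idx = int(np.argmax(later))                # strongest early reflection

extra_path = (reflection_idx - direct_idx) / fs * c   # path-length difference in m
print(f"Extra propagation path: {extra_path:.2f} m")  # ~2.40 m in this toy example
```

Highly absorptive or distant walls weaken exactly this kind of reflection peak, which is the difficulty the entry above refers to.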