Showing 1 - 9 of 9 for search: '"Stamenovic, Marko"'
Target Sound Extraction (TSE) focuses on the problem of separating sources of interest, indicated by a user's cue, from the input mixture. Most existing solutions operate in an offline fashion and are not suited to the low-latency causal processing c
External link:
http://arxiv.org/abs/2403.14246
Author:
Karchkhadze, Tornike, Kavaki, Hassan Salami, Izadi, Mohammad Rasool, Irvin, Bryce, Kegler, Mikolaj, Hertz, Ari, Zhang, Shuo, Stamenovic, Marko
Published in:
EUSIPCO 2024 Proceedings, ISBN: 978-9-4645-9361-7
Foley sound generation, the art of creating audio for multimedia, has recently seen notable advancements through text-conditioned latent diffusion models. These systems use multimodal text-audio representation models, such as Contrastive Language-Aud
External link:
http://arxiv.org/abs/2403.12182
Tiny, causal models are crucial for embedded audio machine learning applications. Model compression can be achieved via distilling knowledge from a large teacher into a smaller student model. In this work, we propose a novel two-step approach for tin
External link:
http://arxiv.org/abs/2309.08144
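As a rough illustration of distilling a large teacher into a smaller student, the sketch below blends a supervised loss with a teacher-matching loss. It is a generic output-level distillation objective, not the two-step method this abstract proposes (its details are truncated here). The function names, the use of mean squared error, and the `alpha` weighting are all assumptions made for illustration.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_out, teacher_out, clean_target, alpha=0.5):
    """Blend a supervised term with a teacher-imitation term.

    alpha weights the error against the clean target;
    (1 - alpha) weights the error against the teacher's output.
    Both terms and the weighting are illustrative assumptions.
    """
    supervised = mse(student_out, clean_target)
    imitation = mse(student_out, teacher_out)
    return alpha * supervised + (1.0 - alpha) * imitation


if __name__ == "__main__":
    # Toy two-sample "signals" for the student, teacher, and clean target.
    print(distillation_loss([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], alpha=0.5))
```

Setting `alpha=1.0` reduces this to plain supervised training, which makes the teacher's contribution easy to ablate.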
CCATMos: Convolutional Context-aware Transformer Network for Non-intrusive Speech Quality Assessment
Speech quality assessment has been a critical component in many voice communication related applications such as telephony and online conferencing. Traditional intrusive speech quality assessment requires the clean reference of the degraded utterance
External link:
http://arxiv.org/abs/2211.02577
Modern speech enhancement (SE) networks typically implement noise suppression through time-frequency masking, latent representation masking, or discriminative signal prediction. In contrast, some recent works explore SE via generative speech synthesi
External link:
http://arxiv.org/abs/2211.02542
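For readers unfamiliar with the time-frequency masking mentioned in this abstract, the sketch below applies a per-bin gain to a noisy magnitude spectrogram. It uses an oracle ideal ratio mask computed from known speech and noise magnitudes, whereas a real enhancement network would predict the mask from the noisy input. The function names and the nested-list spectrogram layout (frames by frequency bins) are assumptions for illustration, not anything taken from the paper.

```python
def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-8):
    """Per-bin mask in [0, 1]: speech magnitude over speech plus noise magnitude."""
    return [
        [s / (s + n + eps) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_mag, noise_mag)
    ]


def apply_mask(noisy_mag, mask):
    """Scale each time-frequency bin of the noisy spectrogram by its mask value."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, noisy_mag)
    ]


if __name__ == "__main__":
    speech = [[3.0, 0.0]]   # one frame, two frequency bins
    noise = [[1.0, 2.0]]
    # Magnitudes are simply added here, which is a simplification.
    noisy = [[s + n for s, n in zip(sr, nr)] for sr, nr in zip(speech, noise)]
    mask = ideal_ratio_mask(speech, noise)
    print(apply_mask(noisy, mask))
```

Bins dominated by noise receive a gain near zero, while bins dominated by speech pass through nearly unchanged.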
We explore network sparsification strategies with the aim of compressing neural speech enhancement (SE) down to an optimal configuration for a new generation of low-power, microcontroller-based neural accelerators (microNPUs). We examine three unique
External link:
http://arxiv.org/abs/2111.02351
Author:
Stamenovic, Marko, Luo, Jiebo
Published in:
2017 IEEE Third International Conference on Multimedia Big Data (BigMM)
The volume of academic paper submissions and publications is growing at an ever increasing rate. While this flood of research promises progress in various fields, the sheer volume of output inherently increases the amount of noise. We present a syste
External link:
http://arxiv.org/abs/2005.10321
Author:
Stamenovic, Marko
Published in:
Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, PMLR 80, 2018
A cover song, by definition, is a new performance or recording of a previously recorded, commercially released song. It may be by the original artist themselves or a different artist altogether and can vary from the original in unpredictable ways inc
External link:
http://arxiv.org/abs/2005.10294
Author:
Fedorov, Igor, Stamenovic, Marko, Jensen, Carl, Yang, Li-Chia, Mandell, Ari, Gan, Yiming, Mattina, Matthew, Whatmough, Paul N.
Modern speech enhancement algorithms achieve remarkable noise suppression by means of large recurrent neural networks (RNNs). However, large RNNs limit practical deployment in hearing aid hardware (HW) form-factors, which are battery powered and run
External link:
http://arxiv.org/abs/2005.11138