Showing 1 - 10 of 23 for search: '"Krishnaswamy, Arvindh"'
Author:
Wang, Zhepei, Giri, Ritwik, Venkataramani, Shrikant, Isik, Umut, Valin, Jean-Marc, Smaragdis, Paris, Goodwin, Mike, Krishnaswamy, Arvindh
In this work, we propose Exformer, a time-domain architecture for target speaker extraction. It consists of a pre-trained speaker embedder network and a separator network based on transformer encoder blocks. We study multiple methods to combine speak
External link:
http://arxiv.org/abs/2206.09072
In real life, room effect, also known as room reverberation, and the present background noise degrade the quality of speech. Recently, deep learning-based speech enhancement approaches have shown a lot of promise and surpassed traditional denoising a
External link:
http://arxiv.org/abs/2206.07917
Author:
Valin, Jean-Marc, Mustafa, Ahmed, Montgomery, Christopher, Terriberry, Timothy B., Klingbeil, Michael, Smaragdis, Paris, Krishnaswamy, Arvindh
As deep speech enhancement algorithms have recently demonstrated capabilities greatly surpassing their traditional counterparts for suppressing noise, reverberation and echo, attention is turning to the problem of packet loss concealment (PLC). PLC i
External link:
http://arxiv.org/abs/2205.05785
Author:
Yuan, Siyuan, Wang, Zhepei, Isik, Umut, Giri, Ritwik, Valin, Jean-Marc, Goodwin, Michael M., Krishnaswamy, Arvindh
Singing voice separation aims to separate music into vocals and accompaniment components. One of the major constraints for the task is the limited amount of training data with separated vocals. Data augmentation techniques such as random source mixin
External link:
http://arxiv.org/abs/2203.15092
Neural vocoders have recently demonstrated high quality speech synthesis, but typically require a high computational complexity. LPCNet was proposed as a way to reduce the complexity of neural synthesis by using linear prediction (LP) to assist an au
External link:
http://arxiv.org/abs/2202.11301
Neural speech synthesis models can synthesize high quality speech but typically require a high computational complexity to do so. In previous work, we introduced LPCNet, which uses linear prediction to significantly reduce the complexity of neural sy
External link:
http://arxiv.org/abs/2202.11169
Published in:
RobustML Workshop - ICLR 2021
We propose an outlier robust multivariate time series model which can be used for detecting previously unseen anomalous sounds based on noisy training data. The presented approach doesn't assume the presence of labeled anomalies in the training datas
External link:
http://arxiv.org/abs/2202.01784
The presence of multiple talkers in the surrounding environment poses a difficult challenge for real-time speech communication systems considering the constraints on network size and complexity. In this paper, we present Personalized PercepNet, a rea
External link:
http://arxiv.org/abs/2106.04129
Recent progress in singing voice separation has primarily focused on supervised deep learning methods. However, the scarcity of ground-truth data with clean musical sources has been a problem for long. Given a limited set of labeled data, we present
External link:
http://arxiv.org/abs/2102.07961
Author:
Casebeer, Jonah, Vale, Vinjai, Isik, Umut, Valin, Jean-Marc, Giri, Ritwik, Krishnaswamy, Arvindh
Audio codecs based on discretized neural autoencoders have recently been developed and shown to provide significantly higher compression levels for comparable quality speech output. However, these models are tightly coupled with speech content, and p
External link:
http://arxiv.org/abs/2102.06610