Search results - "Krishnaswamy, Arvindh"

Report

Semi-supervised Time Domain Target Speaker Extraction with Attention

Authors: Wang, Zhepei; Giri, Ritwik; Venkataramani, Shrikant; Isik, Umut; Valin, Jean-Marc; Smaragdis, Paris; Goodwin, Mike; Krishnaswamy, Arvindh

In this work, we propose Exformer, a time-domain architecture for target speaker extraction. It consists of a pre-trained speaker embedder network and a separator network based on transformer encoder blocks. We study multiple methods to combine speak

External link: http://arxiv.org/abs/2206.09072

Report

To Dereverb Or Not to Dereverb? Perceptual Studies On Real-Time Dereverberation Targets

Authors: Valin, Jean-Marc; Giri, Ritwik; Venkataramani, Shrikant; Isik, Umut; Krishnaswamy, Arvindh

In real life, room effect, also known as room reverberation, and the present background noise degrade the quality of speech. Recently, deep learning-based speech enhancement approaches have shown a lot of promise and surpassed traditional denoising a

External link: http://arxiv.org/abs/2206.07917

Report

Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model

Authors: Valin, Jean-Marc; Mustafa, Ahmed; Montgomery, Christopher; Terriberry, Timothy B.; Klingbeil, Michael; Smaragdis, Paris; Krishnaswamy, Arvindh

As deep speech enhancement algorithms have recently demonstrated capabilities greatly surpassing their traditional counterparts for suppressing noise, reverberation and echo, attention is turning to the problem of packet loss concealment (PLC). PLC i

External link: http://arxiv.org/abs/2205.05785

Report

Improved singing voice separation with chromagram-based pitch-aware remixing

Authors: Yuan, Siyuan; Wang, Zhepei; Isik, Umut; Giri, Ritwik; Valin, Jean-Marc; Goodwin, Michael M.; Krishnaswamy, Arvindh

Singing voice separation aims to separate music into vocals and accompaniment components. One of the major constraints for the task is the limited amount of training data with separated vocals. Data augmentation techniques such as random source mixin

External link: http://arxiv.org/abs/2203.15092

Report

End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation

Authors: Subramani, Krishna; Valin, Jean-Marc; Isik, Umut; Smaragdis, Paris; Krishnaswamy, Arvindh

Neural vocoders have recently demonstrated high quality speech synthesis, but typically require a high computational complexity. LPCNet was proposed as a way to reduce the complexity of neural synthesis by using linear prediction (LP) to assist an au

External link: http://arxiv.org/abs/2202.11301

Report

Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet

Authors: Valin, Jean-Marc; Isik, Umut; Smaragdis, Paris; Krishnaswamy, Arvindh

Neural speech synthesis models can synthesize high quality speech but typically require a high computational complexity to do so. In previous work, we introduced LPCNet, which uses linear prediction to significantly reduce the complexity of neural sy

External link: http://arxiv.org/abs/2202.11169

Report

Robust Audio Anomaly Detection

Authors: Lee, Wo Jae; Helwani, Karim; Krishnaswamy, Arvindh; Tenneti, Srikanth

Published in: RobustML Workshop - ICLR 2021

We propose an outlier robust multivariate time series model which can be used for detecting previously unseen anomalous sounds based on noisy training data. The presented approach doesn't assume the presence of labeled anomalies in the training datas

External link: http://arxiv.org/abs/2202.01784

Report

Personalized PercepNet: Real-time, Low-complexity Target Voice Separation and Enhancement

Authors: Giri, Ritwik; Venkataramani, Shrikant; Valin, Jean-Marc; Isik, Umut; Krishnaswamy, Arvindh

The presence of multiple talkers in the surrounding environment poses a difficult challenge for real-time speech communication systems considering the constraints on network size and complexity. In this paper, we present Personalized PercepNet, a rea

External link: http://arxiv.org/abs/2106.04129

Report

Semi-Supervised Singing Voice Separation with Noisy Self-Training

Authors: Wang, Zhepei; Giri, Ritwik; Isik, Umut; Valin, Jean-Marc; Krishnaswamy, Arvindh

Recent progress in singing voice separation has primarily focused on supervised deep learning methods. However, the scarcity of ground-truth data with clean musical sources has been a problem for long. Given a limited set of labeled data, we present

External link: http://arxiv.org/abs/2102.07961

Report

Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

Authors: Casebeer, Jonah; Vale, Vinjai; Isik, Umut; Valin, Jean-Marc; Giri, Ritwik; Krishnaswamy, Arvindh

Audio codecs based on discretized neural autoencoders have recently been developed and shown to provide significantly higher compression levels for comparable quality speech output. However, these models are tightly coupled with speech content, and p

External link: http://arxiv.org/abs/2102.06610
