Showing 1 - 10 of 26 for search: '"NEEKHARA, PAARTH"'
Author:
Casanova, Edresson, Langman, Ryan, Neekhara, Paarth, Hussain, Shehzeen, Li, Jason, Ghosh, Subhankar, Jukić, Ante, Lee, Sang-gil
Large language models (LLMs) have significantly advanced audio processing through audio codecs that convert audio into discrete tokens, enabling the application of language modeling techniques to audio data. However, audio codecs often operate at hig
External link:
http://arxiv.org/abs/2409.12117
Author:
Neekhara, Paarth, Hussain, Shehzeen, Ghosh, Subhankar, Li, Jason, Valle, Rafael, Badlani, Rohan, Ginsburg, Boris
Large Language Model (LLM) based text-to-speech (TTS) systems have demonstrated remarkable capabilities in handling large speech datasets and generating natural speech for new speakers. However, LLM-based TTS models are not robust as the generated ou
External link:
http://arxiv.org/abs/2406.17957
We present REMARK-LLM, a novel, efficient, and robust watermarking framework designed for texts generated by large language models (LLMs). Synthesizing human-like content using LLMs necessitates vast computational resources and extensive datasets, enc
External link:
http://arxiv.org/abs/2310.12362
Author:
Neekhara, Paarth, Hussain, Shehzeen, Valle, Rafael, Ginsburg, Boris, Ranjan, Rishabh, Dubnov, Shlomo, Koushanfar, Farinaz, McAuley, Julian
We propose SelfVC, a training strategy to iteratively improve a voice conversion model with self-synthesized examples. Previous efforts on voice conversion focus on factorizing speech into explicitly disentangled representations that separately encod
External link:
http://arxiv.org/abs/2310.09653
In this work, we propose a zero-shot voice conversion method using speech representations trained with self-supervised learning. First, we develop a multi-task model to decompose a speech utterance into features such as linguistic content, speaker ch
External link:
http://arxiv.org/abs/2302.08137
Author:
Hussain, Shehzeen, Sheybani, Nojan, Neekhara, Paarth, Zhang, Xinqiao, Duarte, Javier, Koushanfar, Farinaz
Steganography and digital watermarking are the tasks of hiding recoverable data in image pixels. Deep neural network (DNN) based image steganography and watermarking techniques are quickly replacing traditional hand-engineered pipelines. DNN based wa
External link:
http://arxiv.org/abs/2209.12391
Author:
Hussain, Shehzeen, Huster, Todd, Mesterharm, Chris, Neekhara, Paarth, An, Kevin, Jere, Malhar, Sikka, Harshvardhan, Koushanfar, Farinaz
Deep neural network based face recognition models have been shown to be vulnerable to adversarial examples. However, many of the past attacks require the adversary to solve an input-dependent optimization problem using gradient descent which makes th
External link:
http://arxiv.org/abs/2206.04783
Author:
Neekhara, Paarth, Hussain, Shehzeen, Zhang, Xinqiao, Huang, Ke, McAuley, Julian, Koushanfar, Farinaz
Deepfakes and manipulated media are becoming a prominent threat due to the recent advances in realistic image and video synthesis techniques. There have been several attempts at combating Deepfakes using machine learning classifiers. However, such cl
External link:
http://arxiv.org/abs/2204.01960
Training neural text-to-speech (TTS) models for a new speaker typically requires several hours of high quality speech data. Prior works on voice cloning attempt to address this challenge by adapting pre-trained multi-speaker TTS models for a new voic
External link:
http://arxiv.org/abs/2110.05798
There has been a recent surge in adversarial attacks on deep learning based automatic speech recognition (ASR) systems. These attacks pose new challenges to deep learning security and have raised significant concerns in deploying ASR systems in safet
External link:
http://arxiv.org/abs/2103.03344