Showing 1 - 10 of 2,960 results for search: '"Sharma, Rajesh"'
Author:
Phukan, Orchid Chetia; Girish; Akhtar, Mohd Mujtaba; Behera, Swarup Ranjan; Choudhury, Nitin; Buduru, Arun Balaji; Sharma, Rajesh; Prasanna, S. R. Mahadeva
The adaptation of foundation models has significantly advanced environmental audio deepfake detection (EADD), a rapidly growing area of research. These models are typically fine-tuned or utilized in their frozen states for downstream tasks. However,
External link:
http://arxiv.org/abs/2409.15767
Author:
Phukan, Orchid Chetia; Behera, Swarup Ranjan; Singh, Shubham; Singh, Muskaan; Rajan, Vandana; Buduru, Arun Balaji; Sharma, Rajesh; Prasanna, S. R. Mahadeva
In this study, we address the challenge of depression detection from speech, focusing on the potential of non-semantic features (NSFs) to capture subtle markers of depression. While prior research has leveraged various features for this task, NSFs-ex
External link:
http://arxiv.org/abs/2409.14312
Author:
Phukan, Orchid Chetia; Akhtar, Mohd Mujtaba; Girish; Behera, Swarup Ranjan; Kalita, Sishir; Buduru, Arun Balaji; Sharma, Rajesh; Prasanna, S. R. Mahadeva
In this study, we investigate multimodal foundation models (MFMs) for emotion recognition from non-verbal sounds. We hypothesize that MFMs, with their joint pre-training across multiple modalities, will be more effective in non-verbal sounds emotion
External link:
http://arxiv.org/abs/2409.14221
Author:
Phukan, Orchid Chetia; Jain, Sarthak; Behera, Swarup Ranjan; Buduru, Arun Balaji; Sharma, Rajesh; Prasanna, S. R. Mahadeva
In this study, for the first time, we extensively investigate whether music foundation models (MFMs) or speech foundation models (SFMs) work better for singing voice deepfake detection (SVDD), which has recently attracted attention in the research co
External link:
http://arxiv.org/abs/2409.14131
Analyzing user reviews for sentiment towards app features can provide valuable insights into users' perceptions of app functionality and their evolving needs. Given the volume of user reviews received daily, an automated mechanism to generate feature
External link:
http://arxiv.org/abs/2409.07162
In recent years, the proliferation of misinformation on social media platforms has become a significant concern. Initially designed for sharing information and fostering social connections, platforms like Twitter (now rebranded as X) have also unfort
External link:
http://arxiv.org/abs/2406.12444
In this work, we present AVR, an application for audio-visual humor detection. While humor detection has traditionally centered around textual analysis, recent advancements have spotlighted multimodal approaches. However, these methods lean on textual c
External link:
http://arxiv.org/abs/2406.10448
Author:
Phukan, Orchid Chetia; Mallick, Priyabrata; Behera, Swarup Ranjan; Narayani, Aalekhya Satya; Buduru, Arun Balaji; Sharma, Rajesh
In this paper, we work towards extending Audio-Visual Question Answering (AVQA) to multilingual settings. Existing AVQA research has predominantly revolved around English and replicating it for addressing AVQA in other languages requires a substantia
External link:
http://arxiv.org/abs/2406.09156
In this paper, we focus on audio violence detection (AVD). AVD is necessary for several reasons, especially in the context of maintaining safety, preventing harm, and ensuring security in various environments. This calls for accurate AVD systems. Lik
External link:
http://arxiv.org/abs/2406.06798
Emotion Recognition (ER), Gender Recognition (GR), and Age Estimation (AE) constitute paralinguistic tasks that rely not on the spoken content but primarily on speech characteristics such as pitch and tone. While previous research has made significan
External link:
http://arxiv.org/abs/2406.06781