Search results - "SELVAKUMAR, A."

Report

MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark

Authors: Sakshi, S; Tyagi, Utkarsh; Kumar, Sonal; Seth, Ashish; Selvakumar, Ramaneswaran; Nieto, Oriol; Duraiswami, Ramani; Ghosh, Sreyan; Manocha, Dinesh

The ability to comprehend audio--which includes speech, non-speech sounds, and music--is crucial for AI agents to interact effectively with the world. We present MMAU, a novel benchmark designed to evaluate multimodal audio understanding models on ta

External link: http://arxiv.org/abs/2410.19168

Report

Do Audio-Language Models Understand Linguistic Variations?

Authors: Selvakumar, Ramaneswaran; Kumar, Sonal; Giri, Hemant Kumar; Anand, Nishit; Seth, Ashish; Ghosh, Sreyan; Manocha, Dinesh

Open-vocabulary audio language models (ALMs), like Contrastive Language Audio Pretraining (CLAP), represent a promising new paradigm for audio-text retrieval using natural language queries. In this paper, for the first time, we perform controlled exp

External link: http://arxiv.org/abs/2410.16505

Report

PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification

Authors: Seth, Ashish; Selvakumar, Ramaneswaran; Kumar, Sonal; Ghosh, Sreyan; Manocha, Dinesh

Audio-Language Models (ALMs) have demonstrated remarkable performance in zero-shot audio classification. In this paper, we introduce PAT (Parameter-free Audio-Text aligner), a simple and training-free method aimed at boosting the zero-shot audio clas

External link: http://arxiv.org/abs/2410.15062

Report

EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning

Authors: Seth, Ashish; Selvakumar, Ramaneswaran; Sakshi, S; Kumar, Sonal; Ghosh, Sreyan; Manocha, Dinesh

In this paper, we present EH-MAM (Easy-to-Hard adaptive Masked Acoustic Modeling), a novel self-supervised learning approach for speech representation learning. In contrast to the prior methods that use random masking schemes for Masked Acoustic Mode

External link: http://arxiv.org/abs/2410.13179

Report

Stereo Vision Based Robot for Remote Monitoring with VR Support

Authors: S., Mohamed Fazil M.; A., Arockia Selvakumar; Schilberg, Daniel

Published in: International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958 (Online), Volume-9 Issue-1S3, December 2019

The machine vision systems have been playing a significant role in visual monitoring systems. With the help of stereovision and machine learning, it will be able to mimic human-like visual system and behaviour towards the environment. In this paper,

External link: http://arxiv.org/abs/2406.19498

Report

GFFE: G-buffer Free Frame Extrapolation for Low-latency Real-time Rendering

Authors: Wu, Songyin; Vembar, Deepak; Sochenov, Anton; Panneer, Selvakumar; Kim, Sungye; Kaplanyan, Anton; Yan, Ling-Qi

Real-time rendering has been embracing ever-demanding effects, such as ray tracing. However, rendering such effects in high resolution and high frame rate remains challenging. Frame extrapolation methods, which don't introduce additional latency as o

External link: http://arxiv.org/abs/2406.18551

Report

Estimating HANK with Micro Data

Authors: Iao, Man Chon; Selvakumar, Yatheesan J.

We propose an indirect inference strategy for estimating heterogeneous-agent business cycle models with micro data. At its heart is a first-order vector autoregression that is grounded in linear filtering theory as the cross-section grows large. The

External link: http://arxiv.org/abs/2402.11379

Academic article

Hybrid cumulative approach for localization of nodes with adaptive threshold gradient feature on energy minimization using federated learning

Authors: I., Adumbabu; Selvakumar, K.

Published in: International Journal of Pervasive Computing and Communications, 2022, Vol. 20, Issue 4, pp. 496-509.

External link: http://www.emeraldinsight.com/doi/10.1108/IJPCC-02-2022-0045

Report

Multi-wavelength observations of multiple eruptions of the recurrent nova M31N 2008-12a

Authors: Basu, Judhajeet; Pavana, M.; Anupama, G. C.; Barway, Sudhanshu; Singh, Kulinder Pal; Swain, Vishwajeet; Srivastav, Shubham; Kumar, Harsh; Bhalero, Varun; Sonith, L. S.; Selvakumar, G.

We report the optical, UV, and soft X-ray observations of the 2017-2022 eruptions of the recurrent nova M31N 2008-12a. We infer a steady decrease in the accretion rate over the years based on the inter-eruption recurrence period. We find a "cusp"

External link: http://arxiv.org/abs/2310.06586

Report

Getting More for Less: Using Weak Labels and AV-Mixup for Robust Audio-Visual Speaker Verification

Authors: Selvakumar, Anith; Fashandi, Homa

Published in: Proc. Interspeech 2024, 4728-4732

Distance Metric Learning (DML) has typically dominated the audio-visual speaker verification problem space, owing to strong performance in new and unseen classes. In our work, we explored multitask learning techniques to further enhance DML, and show

External link: http://arxiv.org/abs/2309.07115
