Showing 1 - 10 of 17,666 results for search: '"Selvakumar, A."'
Author:
Sakshi, S, Tyagi, Utkarsh, Kumar, Sonal, Seth, Ashish, Selvakumar, Ramaneswaran, Nieto, Oriol, Duraiswami, Ramani, Ghosh, Sreyan, Manocha, Dinesh
The ability to comprehend audio--which includes speech, non-speech sounds, and music--is crucial for AI agents to interact effectively with the world. We present MMAU, a novel benchmark designed to evaluate multimodal audio understanding models on ta
External link:
http://arxiv.org/abs/2410.19168
Author:
Selvakumar, Ramaneswaran, Kumar, Sonal, Giri, Hemant Kumar, Anand, Nishit, Seth, Ashish, Ghosh, Sreyan, Manocha, Dinesh
Open-vocabulary audio language models (ALMs), like Contrastive Language Audio Pretraining (CLAP), represent a promising new paradigm for audio-text retrieval using natural language queries. In this paper, for the first time, we perform controlled exp
External link:
http://arxiv.org/abs/2410.16505
Audio-Language Models (ALMs) have demonstrated remarkable performance in zero-shot audio classification. In this paper, we introduce PAT (Parameter-free Audio-Text aligner), a simple and training-free method aimed at boosting the zero-shot audio clas
External link:
http://arxiv.org/abs/2410.15062
Author:
Seth, Ashish, Selvakumar, Ramaneswaran, Sakshi, S, Kumar, Sonal, Ghosh, Sreyan, Manocha, Dinesh
In this paper, we present EH-MAM (Easy-to-Hard adaptive Masked Acoustic Modeling), a novel self-supervised learning approach for speech representation learning. In contrast to the prior methods that use random masking schemes for Masked Acoustic Mode
External link:
http://arxiv.org/abs/2410.13179
Published in:
International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249-8958 (Online), Volume-9 Issue-1S3, December 2019
The machine vision systems have been playing a significant role in visual monitoring systems. With the help of stereovision and machine learning, it will be able to mimic human-like visual system and behaviour towards the environment. In this paper,
External link:
http://arxiv.org/abs/2406.19498
Author:
Wu, Songyin, Vembar, Deepak, Sochenov, Anton, Panneer, Selvakumar, Kim, Sungye, Kaplanyan, Anton, Yan, Ling-Qi
Real-time rendering has been embracing ever-demanding effects, such as ray tracing. However, rendering such effects in high resolution and high frame rate remains challenging. Frame extrapolation methods, which don't introduce additional latency as o
External link:
http://arxiv.org/abs/2406.18551
We propose an indirect inference strategy for estimating heterogeneous-agent business cycle models with micro data. At its heart is a first-order vector autoregression that is grounded in linear filtering theory as the cross-section grows large. The
External link:
http://arxiv.org/abs/2402.11379
Author:
I., Adumbabu, Selvakumar, K.
Published in:
International Journal of Pervasive Computing and Communications, 2022, Vol. 20, Issue 4, pp. 496-509.
External link:
http://www.emeraldinsight.com/doi/10.1108/IJPCC-02-2022-0045
Author:
Basu, Judhajeet, Pavana, M., Anupama, G. C., Barway, Sudhanshu, Singh, Kulinder Pal, Swain, Vishwajeet, Srivastav, Shubham, Kumar, Harsh, Bhalero, Varun, Sonith, L. S., Selvakumar, G.
We report the optical, UV, and soft X-ray observations of the 2017-2022 eruptions of the recurrent nova M31N 2008-12a. We infer a steady decrease in the accretion rate over the years based on the inter-eruption recurrence period. We find a "cusp"
External link:
http://arxiv.org/abs/2310.06586
Author:
Selvakumar, Anith, Fashandi, Homa
Published in:
Proc. Interspeech 2024, 4728-4732
Distance Metric Learning (DML) has typically dominated the audio-visual speaker verification problem space, owing to strong performance in new and unseen classes. In our work, we explored multitask learning techniques to further enhance DML, and show
External link:
http://arxiv.org/abs/2309.07115