Showing 1 - 10 of 92 for search: '"Gupta, Anirudh"'
Author:
Varadhan, Praveen Srinivasa, Gulati, Amogh, Sankar, Ashwin, Anand, Srija, Gupta, Anirudh, Mukherjee, Anirudh, Marepally, Shiva Kumar, Bhatia, Ankur, Jaju, Saloni, Bhooshan, Suvrat, Khapra, Mitesh M.
Despite rapid advancements in TTS models, a consistent and robust human evaluation framework is still lacking. For example, MOS tests fail to differentiate between similar models, and CMOS's pairwise comparisons are time-intensive. The MUSHRA test is
External link:
http://arxiv.org/abs/2411.12719
Author:
Chhimwal, Neeraj, Gupta, Anirudh, Gaur, Rishabh, Chadha, Harveen Singh, Shah, Priyanshi, Dhuriya, Ankur, Raghavan, Vivek
In this paper, we propose a pipeline to find the number of speakers, as well as the audio belonging to each of these identified speakers, in a source of audio data where the number of speakers or the speaker labels are not known a priori. We used this approa
External link:
http://arxiv.org/abs/2205.02475
Author:
Gupta, Anirudh, Chhimwal, Neeraj, Dhuriya, Ankur, Gaur, Rishabh, Shah, Priyanshi, Chadha, Harveen Singh, Raghavan, Vivek
Automatic Speech Recognition (ASR) generates text which is most of the time devoid of any punctuation. Absence of punctuation in text can affect readability. Also, downstream NLP tasks such as sentiment analysis and machine translation greatly benefi
External link:
http://arxiv.org/abs/2203.16825
Author:
Gupta, Anirudh, Gaur, Rishabh, Dhuriya, Ankur, Chadha, Harveen Singh, Chhimwal, Neeraj, Shah, Priyanshi, Raghavan, Vivek
In recent years, end-to-end (E2E) automatic speech recognition (ASR) systems have achieved promising results given sufficient resources. Even for languages where not a lot of labelled data is available, state-of-the-art E2E ASR systems can be deve
External link:
http://arxiv.org/abs/2203.16823
Author:
Shah, Priyanshi, Chadha, Harveen Singh, Gupta, Anirudh, Dhuriya, Ankur, Chhimwal, Neeraj, Gaur, Rishabh, Raghavan, Vivek
We propose a new method for the calculation of error rates in Automatic Speech Recognition (ASR). This new metric is for languages that contain half characters and where the same character can be written in different forms. We implement our methodolo
External link:
http://arxiv.org/abs/2203.16601
Author:
Dhuriya, Ankur, Chadha, Harveen Singh, Gupta, Anirudh, Shah, Priyanshi, Chhimwal, Neeraj, Gaur, Rishabh, Raghavan, Vivek
We study the effect of applying a language model (LM) on the output of Automatic Speech Recognition (ASR) systems for Indic languages. We fine-tune wav2vec 2.0 models for 18 Indic languages and adjust the results with language models trained on t
External link:
http://arxiv.org/abs/2203.16595
Author:
Chadha, Harveen Singh, Shah, Priyanshi, Dhuriya, Ankur, Chhimwal, Neeraj, Gupta, Anirudh, Raghavan, Vivek
Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language-specific. Training a multilingual system for Indic languages is even tougher due to the lack of open source
External link:
http://arxiv.org/abs/2203.16578
Author:
Chadha, Harveen Singh, Gupta, Anirudh, Shah, Priyanshi, Chhimwal, Neeraj, Dhuriya, Ankur, Gaur, Rishabh, Raghavan, Vivek
We present Vakyansh, an end-to-end toolkit for Speech Recognition in Indic languages. India is home to almost 121 languages and around 125 crore speakers. Yet most of the languages are low-resource in terms of data and pretrained models. Through Vaky
External link:
http://arxiv.org/abs/2203.16512
Author:
M. A. Altalbawy, Farag, Abdul-Reda Hussein, Uday, Krishna Saraswat, Shelesh, Baldaniya, Lalji, Rekha, M.M., Guntaj, J., Gupta, Anirudh, Sabah Ghnim, Zahraa, Fawzi Al-Hussainy, Ali, Jasim Al-Shuwaili, Saeb, Faez Sead, Fadhil
Published in:
In Computational and Theoretical Chemistry, January 2025, 1243
Author:
Gupta, Anirudh, Chadha, Harveen Singh, Shah, Priyanshi, Chhimwal, Neeraj, Dhuriya, Ankur, Gaur, Rishabh, Raghavan, Vivek
We present CLSRIL-23, a self-supervised learning based audio pre-trained model which learns cross-lingual speech representations from raw audio across 23 Indic languages. It is built on top of wav2vec 2.0 which is solved by training a contrastive t
External link:
http://arxiv.org/abs/2107.07402