Showing 1 - 10 of 47 for search: '"Raghavan, Vivek"'
Author:
Gala, Jay, Chitale, Pranjal A., AK, Raghavan, Gumma, Varun, Doddapaneni, Sumanth, Kumar, Aswanth, Nawale, Janki, Sujatha, Anupama, Puduppully, Ratish, Raghavan, Vivek, Kumar, Pratyush, Khapra, Mitesh M., Dabre, Raj, Kunchukuttan, Anoop
India has a rich linguistic landscape with languages from 4 major language families spoken by over a billion people. 22 of these languages, listed in the Constitution of India (referred to as scheduled languages), are the focus of this work. Given
External link:
http://arxiv.org/abs/2305.16307
Author:
Modi, Ashutosh, Kalamkar, Prathamesh, Karn, Saurabh, Tiwari, Aman, Joshi, Abhinav, Tanikella, Sai Kiran, Guha, Shouvik Kumar, Malhan, Sachin, Raghavan, Vivek
In populous countries, pending legal cases have been growing exponentially. There is a need for developing NLP-based techniques for processing and automatically understanding legal documents. To promote research in the area of Legal NLP we organized
External link:
http://arxiv.org/abs/2304.09548
Author:
Kalamkar, Prathamesh, Agarwal, Astha, Tiwari, Aman, Gupta, Smita, Karn, Saurabh, Raghavan, Vivek
Identification of named entities from legal texts is an essential building block for developing other legal Artificial Intelligence applications. Named Entities in legal texts are slightly different and more fine-grained than commonly used named enti
External link:
http://arxiv.org/abs/2211.03442
Author:
Chhimwal, Neeraj, Gupta, Anirudh, Gaur, Rishabh, Chadha, Harveen Singh, Shah, Priyanshi, Dhuriya, Ankur, Raghavan, Vivek
In this paper, we propose a pipeline to find the number of speakers, as well as the audios belonging to each of these now-identified speakers, in a source of audio data where the number of speakers or the speaker labels are not known a priori. We used this approa
External link:
http://arxiv.org/abs/2205.02475
Author:
Gupta, Anirudh, Chhimwal, Neeraj, Dhuriya, Ankur, Gaur, Rishabh, Shah, Priyanshi, Chadha, Harveen Singh, Raghavan, Vivek
Automatic Speech Recognition (ASR) generates text which is most of the time devoid of any punctuation. Absence of punctuation in text can affect readability. Also, downstream NLP tasks such as sentiment analysis and machine translation greatly benefi
External link:
http://arxiv.org/abs/2203.16825
Author:
Gupta, Anirudh, Gaur, Rishabh, Dhuriya, Ankur, Chadha, Harveen Singh, Chhimwal, Neeraj, Shah, Priyanshi, Raghavan, Vivek
In recent years, end-to-end (E2E) automatic speech recognition (ASR) systems have achieved promising results given sufficient resources. Even for languages where not a lot of labelled data is available, state-of-the-art E2E ASR systems can be deve
External link:
http://arxiv.org/abs/2203.16823
Author:
Shah, Priyanshi, Chadha, Harveen Singh, Gupta, Anirudh, Dhuriya, Ankur, Chhimwal, Neeraj, Gaur, Rishabh, Raghavan, Vivek
We propose a new method for the calculation of error rates in Automatic Speech Recognition (ASR). This new metric is for languages that contain half characters and where the same character can be written in different forms. We implement our methodolo
External link:
http://arxiv.org/abs/2203.16601
Author:
Dhuriya, Ankur, Chadha, Harveen Singh, Gupta, Anirudh, Shah, Priyanshi, Chhimwal, Neeraj, Gaur, Rishabh, Raghavan, Vivek
We study the effect of applying a language model (LM) on the output of Automatic Speech Recognition (ASR) systems for Indic languages. We fine-tune wav2vec 2.0 models for 18 Indic languages and adjust the results with language models trained on t
External link:
http://arxiv.org/abs/2203.16595
Author:
Chadha, Harveen Singh, Shah, Priyanshi, Dhuriya, Ankur, Chhimwal, Neeraj, Gupta, Anirudh, Raghavan, Vivek
Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language-specific. Training multilingual systems for Indic languages is even tougher due to the lack of open source
External link:
http://arxiv.org/abs/2203.16578
Author:
Chadha, Harveen Singh, Gupta, Anirudh, Shah, Priyanshi, Chhimwal, Neeraj, Dhuriya, Ankur, Gaur, Rishabh, Raghavan, Vivek
We present Vakyansh, an end-to-end toolkit for Speech Recognition in Indic languages. India is home to almost 121 languages and around 125 crore speakers. Yet most of the languages are low-resource in terms of data and pretrained models. Through Vaky
External link:
http://arxiv.org/abs/2203.16512