Showing 1 - 10 of 90 for search: '"Garg, Abhinav"'
Time series forecasting in the air cargo industry presents unique challenges due to volatile market dynamics and the significant impact of accurate forecasts on generated revenue. This paper explores a comprehensive approach to demand forecasting at
External link:
http://arxiv.org/abs/2407.20192
Grapheme-to-Phoneme (G2P) is an essential first step in any modern, high-quality Text-to-Speech (TTS) system. Most of the current G2P systems rely on carefully hand-crafted lexicons developed by experts. This poses a two-fold problem. Firstly, the le
External link:
http://arxiv.org/abs/2401.10465
Traditional AI approaches in customized (personalized) contextual pricing applications assume that the data distribution at the time of online pricing is similar to that observed during training. However, this assumption may be violated in practice b
External link:
http://arxiv.org/abs/2111.14938
In this paper, we propose a three-stage training methodology to improve the speech recognition accuracy of low-resource languages. We explore and propose an effective combination of techniques such as transfer learning, encoder freezing, data augment
External link:
http://arxiv.org/abs/2111.10047
In this paper, we present a comparative study on the robustness of two different online streaming speech recognition models: Monotonic Chunkwise Attention (MoChA) and Recurrent Neural Network-Transducer (RNN-T). We explore three recently proposed dat
External link:
http://arxiv.org/abs/2111.10043
In this paper, we present a streaming end-to-end speech recognition model based on Monotonic Chunkwise Attention (MoChA) jointly trained with enhancement layers. Even though the MoChA attention enables streaming speech recognition with recognition ac
External link:
http://arxiv.org/abs/2105.01254
Authors:
Kim, Chanwoo, Gowda, Dhananjaya, Lee, Dongsoo, Kim, Jiyeon, Kumar, Ankur, Kim, Sungsoo, Garg, Abhinav, Han, Changwoo
In this paper, we review various end-to-end automatic speech recognition algorithms and their optimization techniques for on-device applications. Conventional speech recognition systems comprise a large number of discrete components such as an acoust
External link:
http://arxiv.org/abs/2012.07974
Authors:
Venkatakrishnan, AJ, Puranik, Arjun, Anand, Akash, Zemmour, David, Yao, Xiang, Wu, Xiaoying, Chilaka, Ramakrishna, Murakowski, Dariusz K., Standish, Kristopher, Raghunathan, Bharathwaj, Wagner, Tyler, Garcia-Rivera, Enrique, Solomon, Hugo, Garg, Abhinav, Barve, Rakesh, Anyanwu-Ofili, Anuli, Khan, Najat, Soundararajan, Venky
The COVID-19 pandemic demands assimilation of all available biomedical knowledge to decode its mechanisms of pathogenicity and transmission. Despite the recent renaissance in unsupervised neural networks for decoding unstructured natural languages, a
External link:
http://arxiv.org/abs/2003.12773
In this paper, we propose a refined multi-stage multi-task training strategy to improve the performance of online attention-based encoder-decoder (AED) models. A three-stage training based on three levels of architectural granularity namely, characte
External link:
http://arxiv.org/abs/1912.12384
Authors:
Kim, Chanwoo, Kim, Sungsoo, Kim, Kwangyoun, Kumar, Mehul, Kim, Jiyeon, Lee, Kyungmin, Han, Changwoo, Garg, Abhinav, Kim, Eunhyang, Shin, Minkyoo, Singh, Shatrughan, Heck, Larry, Gowda, Dhananjaya
In this paper, we present an end-to-end training framework for building state-of-the-art end-to-end speech recognition systems. Our training system utilizes a cluster of Central Processing Units (CPUs) and Graphics Processing Units (GPUs). The entire
External link:
http://arxiv.org/abs/1912.11040