Showing 1 - 8 of 8 for search: '"Chitkara, Pooja"'
Author:
Klumpp, Philipp, Chitkara, Pooja, Sarı, Leda, Serai, Prashant, Wu, Jilong, Veliche, Irina-Elena, Huang, Rongqing, He, Qing
The awareness for biased ASR datasets or models has increased notably in recent years. Even for English, despite a vast amount of available training data, systems perform worse for non-native speakers. In this work, we improve an accent-conversion mo
External link:
http://arxiv.org/abs/2303.00802
Speech to text models tend to be trained and evaluated against a single target accent. This is especially true for English for which native speakers from the United States became the main benchmark. In this work, we are going to show how two simple m
External link:
http://arxiv.org/abs/2212.12048
Author:
Pandey, Laxmi, Paul, Debjyoti, Chitkara, Pooja, Pang, Yutong, Zhang, Xuedong, Schubert, Kjell, Chou, Mark, Liu, Shu, Saraf, Yatharth
Inverse text normalization (ITN) is used to convert the spoken form output of an automatic speech recognition (ASR) system to a written form. Traditional handcrafted ITN rules can be complex to transcribe and maintain. Meanwhile neural modeling appro
External link:
http://arxiv.org/abs/2207.09674
Author:
Liu, Chunxi, Picheny, Michael, Sarı, Leda, Chitkara, Pooja, Xiao, Alex, Zhang, Xiaohui, Chou, Mark, Alvarado, Andres, Hazirbas, Caner, Saraf, Yatharth
It is well known that many machine learning systems demonstrate bias towards specific groups of individuals. This problem has been studied extensively in the Facial Recognition area, but much less so in Automatic Speech Recognition (ASR). This paper
External link:
http://arxiv.org/abs/2111.09983
Author:
Li, Jialu, Manohar, Vimal, Chitkara, Pooja, Tjandra, Andros, Picheny, Michael, Zhang, Frank, Zhang, Xiaohui, Saraf, Yatharth
Speech recognition models often obtain degraded performance when tested on speech with unseen accents. Domain-adversarial training (DAT) and multi-task learning (MTL) are two common approaches for building accent-robust ASR models. ASR models using a
External link:
http://arxiv.org/abs/2110.03520
Author:
Dendukuri, Sahas, Chitkara, Pooja, Moniz, Joel Ruben Antony, Yang, Xiao, Tsagkias, Manos, Pulman, Stephen
Entity tags in human-machine dialog are integral to natural language understanding (NLU) tasks in conversational assistants. However, current systems struggle to accurately parse spoken queries with the typical use of text input alone, and often fail
External link:
http://arxiv.org/abs/2109.13222
Author:
Muralidharan, Deepak, Moniz, Joel Ruben Antony, Gao, Sida, Yang, Xiao, Kao, Justine, Pulman, Stephen, Kothari, Atish, Shen, Ray, Pan, Yinying, Kaul, Vivek, Ibrahim, Mubarak Seyed, Xiang, Gang, Dun, Nan, Zhou, Yidan, O, Andy, Zhang, Yuan, Chitkara, Pooja, Wang, Xuan, Patel, Alkesh, Tayal, Kushal, Zheng, Roger, Grasch, Peter, Williams, Jason D., Li, Lin
Named Entity Recognition (NER) and Entity Linking (EL) play an essential role in voice assistant interaction, but are challenging due to the special difficulties associated with spoken user queries. In this paper, we propose a novel architecture that
External link:
http://arxiv.org/abs/2005.14408
Success of deep learning techniques has renewed the interest in development of dialogue systems. However, current systems struggle to have consistent long term conversations with the users and fail to build rapport. Topic spotting, the task of autom
External link:
http://arxiv.org/abs/1904.02815