Zobrazeno 1 - 10
of 28
pro vyhledávání: '"Xu, Mingbin"'
Autor:
Lei, Zhihong, Na, Xingyu, Xu, Mingbin, Pusateri, Ernest, Van Gysel, Christophe, Zhang, Yuanyuan, Han, Shiyi, Huang, Zhen
Large language models (LLMs) have shown superb capability of modeling multimodal signals including audio and text, allowing the model to generate spoken or textual response given a speech input. However, it remains a challenge for the model to recogn
Externí odkaz:
http://arxiv.org/abs/2409.15353
In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning architectures like transformer. On top of E2E systems, researchers have achieved substantial ac
Externí odkaz:
http://arxiv.org/abs/2406.03274
Autor:
Xu, Mingbin, Jin, Alex, Wang, Sicheng, Su, Mu, Ng, Tim, Mason, Henry, Han, Shiyi, Lei, Zhihong, Deng, Yaqiao, Huang, Zhen, Krishnamoorthy, Mahesh
With increasingly more powerful compute capabilities and resources in today's devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy. However, it is still c
Externí odkaz:
http://arxiv.org/abs/2312.10359
Autor:
Lei, Zhihong, Pusateri, Ernest, Han, Shiyi, Liu, Leo, Xu, Mingbin, Ng, Tim, Travadi, Ruchir, Zhang, Youyuan, Hannemann, Mirko, Siu, Man-Hung, Huang, Zhen
Recent advances in deep learning and automatic speech recognition have improved the accuracy of end-to-end speech recognition systems, but recognition of personal content such as contact names remains a challenge. In this work, we describe our person
Externí odkaz:
http://arxiv.org/abs/2310.09988
Autor:
Lei, Zhihong, Xu, Mingbin, Han, Shiyi, Liu, Leo, Huang, Zhen, Ng, Tim, Zhang, Yuanyuan, Pusateri, Ernest, Hannemann, Mirko, Deng, Yaqiao, Siu, Man-Hung
Recent advances in deep learning and automatic speech recognition (ASR) have enabled the end-to-end (E2E) ASR system and boosted the accuracy to a new level. The E2E systems implicitly model all conventional ASR components, such as the acoustic model
Externí odkaz:
http://arxiv.org/abs/2310.07062
Autor:
Xu, Mingbin, Song, Congzheng, Tian, Ye, Agrawal, Neha, Granqvist, Filip, van Dalen, Rogier, Zhang, Xiao, Argueta, Arturo, Han, Shiyi, Deng, Yaqiao, Liu, Leo, Walia, Anmol, Jin, Alex
Federated Learning (FL) is a technique to train models using data distributed across devices. Differential Privacy (DP) provides a formal privacy guarantee for sensitive data. Our goal is to train a large neural network language model (NNLM) on compu
Externí odkaz:
http://arxiv.org/abs/2207.08988
In this paper, we explore a new approach to named entity recognition (NER) with the goal of learning from context and fragment features more effectively, contributing to the improvement of overall recognition performance. We use the recent fixed-size
Externí odkaz:
http://arxiv.org/abs/1904.03305
Named entity recognition (NER) systems that perform well require task-related and manually annotated datasets. However, they are expensive to develop, and are thus limited in size. As there already exists a large number of NER datasets that share a c
Externí odkaz:
http://arxiv.org/abs/1904.03300
Question answering over knowledge base (KB-QA) has recently become a popular research topic in NLP. One popular way to solve the KB-QA problem is to make use of a pipeline of several NLP modules, including entity discovery and linking (EDL) and relat
Externí odkaz:
http://arxiv.org/abs/1903.12356
In this paper, we present our method of using fixed-size ordinally forgetting encoding (FOFE) to solve the word sense disambiguation (WSD) problem. FOFE enables us to encode variable-length sequence of words into a theoretically unique fixed-size rep
Externí odkaz:
http://arxiv.org/abs/1902.10246