Showing 1 - 10 of 629 for search: '"SUN, ERIC"'
We present a novel approach for reconstructing annual temperatures in East Asia from 1368 to 1911, leveraging the Reconstructed East Asian Climate Historical Encoded Series (REACHES). The lack of instrumental data during this period poses significant
External link:
http://arxiv.org/abs/2410.21790
Obtaining word timestamp information from end-to-end (E2E) ASR models remains challenging due to the lack of explicit time alignment during training. This issue is further complicated in multilingual models. Existing methods either rely on lexicons
External link:
http://arxiv.org/abs/2409.13913
Published in:
Proceedings of the National Academy of Sciences, 2024, 121(10) e2313719121
Single-cell data integration can provide a comprehensive molecular view of cells, and many algorithms have been developed to remove unwanted technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage
External link:
http://arxiv.org/abs/2308.01839
In end-to-end automatic speech recognition systems, one of the difficulties for language expansion is the limited paired speech and text training data. In this paper, we propose a novel method to generate augmented samples with unpaired speech feature
External link:
http://arxiv.org/abs/2307.16332
Authors:
Sun, Eric; Li, Jinyu; Hu, Yuxuan; Zhu, Yimeng; Zhou, Long; Xue, Jian; Wang, Peidong; Liu, Linquan; Liu, Shujie; Lin, Edward; Gong, Yifan
We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring language identification (LID) input from users during inference. Our method incorporates a gating mechanism and LID loss
External link:
http://arxiv.org/abs/2303.00786
Authors:
Wang, Peidong; Sun, Eric; Xue, Jian; Wu, Yu; Zhou, Long; Gaur, Yashesh; Liu, Shujie; Li, Jinyu
Automatic speech recognition (ASR) and speech translation (ST) can both use neural transducers as the model structure. It is thus possible to use a single transducer model to perform both tasks. In real-world applications, such joint ASR and ST model
External link:
http://arxiv.org/abs/2211.02809
In this paper, we introduce our work of building a Streaming Multilingual Speech Model (SM2), which can transcribe or translate multiple spoken languages into texts of the target language. The backbone of SM2 is Transformer Transducer, which has high
External link:
http://arxiv.org/abs/2211.02499
Dimension reduction and data visualization aim to project a high-dimensional dataset to a low-dimensional space while capturing the intrinsic structures in the data. It is an indispensable part of modern data science, and many dimensional reduction a
External link:
http://arxiv.org/abs/2210.13711
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
Authors:
Kumatani, Kenichi; Gmyr, Robert; Salinas, Felipe Cruz; Liu, Linquan; Zuo, Wei; Patel, Devang; Sun, Eric; Shi, Yu
The sparsely-gated Mixture of Experts (MoE) can magnify network capacity with little computational complexity. In this work, we investigate how multi-lingual Automatic Speech Recognition (ASR) networks can be scaled up with a simple routing algor
External link:
http://arxiv.org/abs/2112.05820
Multilingual end-to-end (E2E) models have shown great potential for expanding language coverage in automatic speech recognition (ASR). In this paper, we aim to enhance the multilingual ASR performance in two ways, 1) studying th
External link:
http://arxiv.org/abs/2110.07909