Showing 1 - 10 of 2,178 for search: '"Zhang, BinBin"'
Author:
Liu, Qize, Pan, Xiaofan, Zheng, Xutao, Gao, Huaizhong, Li, Longhao, Wang, Qidong, Yang, Zirui, Tang, Chenchong, Wu, Wenxuan, Cheng, Jianping, Zeng, Zhi, Zeng, Ming, Feng, Hua, Zhang, Binbin, Wang, Zhonghai, Zhou, Rong, Liu, Yuanyuan, Lin, Lin, Zhong, Jiayong, Jiang, Jianyong, Han, Wentao, Tian, Yang, Xu, Benda, Collaboration, GRID
The Gamma-Ray Integrated Detectors (GRID) are a space science mission that employs compact gamma-ray detectors mounted on NanoSats in low Earth orbit (LEO) to monitor the transient gamma-ray sky. Owing to the unpredictability of the time and location
External link:
http://arxiv.org/abs/2410.13402
Author:
Xue, Hongfei, Gong, Rong, Shao, Mingchen, Xu, Xin, Wang, Lezhi, Xie, Lei, Bu, Hui, Zhou, Jiaming, Qin, Yong, Du, Jun, Li, Ming, Zhang, Binbin, Jia, Bin
The StutteringSpeech Challenge focuses on advancing speech technologies for people who stutter, specifically targeting Stuttering Event Detection (SED) and Automatic Speech Recognition (ASR) in Mandarin. The challenge comprises three tracks: (1) SED,
External link:
http://arxiv.org/abs/2409.05430
In automatic speech recognition, subsampling is essential for tackling diverse scenarios. However, the inadequacy of a single subsampling rate to address various real-world situations often necessitates training and deploying multiple models, consequ
External link:
http://arxiv.org/abs/2408.04325
Author:
Gong, Rong, Xue, Hongfei, Wang, Lezhi, Xu, Xin, Li, Qisheng, Xie, Lei, Bu, Hui, Wu, Shaomei, Zhou, Jiaming, Qin, Yong, Zhang, Binbin, Du, Jun, Bin, Jia, Li, Ming
The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical
External link:
http://arxiv.org/abs/2406.07256
Author:
Ma, Linhan, Guo, Dake, Song, Kun, Jiang, Yuepeng, Wang, Shuai, Xue, Liumeng, Xu, Weiming, Zhao, Huan, Zhang, Binbin, Xie, Lei
With the development of large text-to-speech (TTS) models and scale-up of the training data, state-of-the-art TTS systems have achieved impressive performance. In this paper, we present WenetSpeech4TTS, a multi-domain Mandarin corpus derived from the
External link:
http://arxiv.org/abs/2406.05763
In this paper we derive new second-order optimality conditions for a very general set-constrained optimization problem where the underlying set may be nonconvex. We consider local optimality in specific directions (i.e., optimal in a directional neigh
External link:
http://arxiv.org/abs/2404.17696
Author:
Song, Xingchen, Wu, Di, Zhang, Binbin, Zhou, Dinghao, Peng, Zhendong, Dang, Bo, Pan, Fuping, Yang, Chao
Scale has opened new frontiers in natural language processing, but at a high cost. In response, by learning to only activate a subset of parameters in training and inference, Mixture-of-Experts (MoE) have been proposed as an energy efficient path to
External link:
http://arxiv.org/abs/2404.16407
Author:
Wang, He, Guo, Pengcheng, Li, Yue, Zhang, Ao, Sun, Jiayao, Xie, Lei, Chen, Wei, Zhou, Pan, Bu, Hui, Xu, Xin, Zhang, Binbin, Chen, Zhuo, Wu, Jian, Wang, Longbiao, Chng, Eng Siong, Li, Sun
To promote speech processing and recognition research in driving scenarios, we build on the success of the Intelligent Cockpit Speech Recognition Challenge (ICSRC) held at ISCSLP 2022 and launch the ICASSP 2024 In-Car Multi-Channel Automatic Speech R
External link:
http://arxiv.org/abs/2401.03473
Author:
Zhu, Jiahuan, Zheng, Xutao, Feng, Hua, Zeng, Ming, Huang, Chien-You, Hsiang, Jr-Yue, Chang, Hsiang-Kuang, Li, Hong, Chang, Hao, Pan, Xiaofan, Ma, Ge, Wu, Qiong, Li, Yulan, Bai, Xuening, Ge, Mingyu, Ji, Long, Li, Jian, Shen, Yangping, Wang, Wei, Wang, Xilu, Zhang, Binbin, Zhang, Jin
We propose a future mission concept, the MeV Astrophysical Spectroscopic Surveyor (MASS), which is a large area Compton telescope using 3D position sensitive cadmium zinc telluride (CZT) detectors optimized for emission line detection. The payload co
External link:
http://arxiv.org/abs/2312.11900
This study describes our system for Task 1 Single-speaker Visual Speech Recognition (VSR) fixed track in the Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) 2023. Specifically, we use intermediate connectionist temporal classification
External link:
http://arxiv.org/abs/2312.07254