Search results - "Peng, Zhendong"

Report

HydraFormer: One Encoder For All Subsampling Rates

Authors: Xu, Yaoxun; Song, Xingchen; Wu, Zhiyong; Wu, Di; Peng, Zhendong; Zhang, Binbin

In automatic speech recognition, subsampling is essential for tackling diverse scenarios. However, the inadequacy of a single subsampling rate to address various real-world situations often necessitates training and deploying multiple models, consequ

External link: http://arxiv.org/abs/2408.04325

Report

Synchronization Scheme based on Pilot Sharing in Cell-Free Massive MIMO Systems

Authors: Peng, Qihao; Ren, Hong; Peng, Zhendong; Pan, Cunhua; Elkashlan, Maged; Wang, Dongming; Wang, Jiangzhou; You, Xiaohu

This paper analyzes the impact of a pilot-sharing scheme on synchronization performance in a scenario where several slave access points (APs) with uncertain carrier frequency offsets (CFOs) and timing offsets (TOs) share a common pilot sequence. First,

External link: http://arxiv.org/abs/2405.18775

Report

U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF

Authors: Song, Xingchen; Wu, Di; Zhang, Binbin; Zhou, Dinghao; Peng, Zhendong; Dang, Bo; Pan, Fuping; Yang, Chao

Scale has opened new frontiers in natural language processing, but at a high cost. In response, by learning to only activate a subset of parameters in training and inference, Mixture-of-Experts (MoE) have been proposed as an energy efficient path to

External link: http://arxiv.org/abs/2404.16407

Report

Radar Rainbow Beams For Wideband mmWave Communication: Beam Training And Tracking

Authors: Zhou, Gui; Garkisch, Moritz; Peng, Zhendong; Pan, Cunhua; Schober, Robert

We propose a novel integrated sensing and communication (ISAC) system that leverages sensing to assist communication, ensuring fast initial access, seamless user tracking, and uninterrupted communication for millimeter wave (mmWave) wideband systems.

External link: http://arxiv.org/abs/2403.09330

Report

LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech

Authors: Chen, Jie; Song, Xingchen; Peng, Zhendong; Zhang, Binbin; Pan, Fuping; Wu, Zhiyong

Recent advances in neural text-to-speech (TTS) models bring thousands of TTS applications into daily life, where models are deployed in the cloud to provide services for customers. Among these models are diffusion probabilistic models (DPMs), which can be

External link: http://arxiv.org/abs/2308.16569

Report

ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs

Authors: Song, Xingchen; Wu, Di; Zhang, Binbin; Peng, Zhendong; Dang, Bo; Pan, Fuping; Wu, Zhiyong

Published in: Proc. INTERSPEECH 2023, pp. 1648-1652

In this paper, we present ZeroPrompt (Figure 1-(a)) and the corresponding Prompt-and-Refine strategy (Figure 3), two simple but effective training-free methods to decrease the Token Display Time (TDT) of streaming ASR models without

External link: http://arxiv.org/abs/2305.10649

Report

Power Adaptation for Suborbital Downlink with Stochastic Satellites Interference

Authors: He, Yihao; Ma, Juntao; Peng, Zhendong; Wu, Gang

This paper investigates downlink power adaptation for the suborbital node in suborbital-ground communication systems, which are subject to extremely high reliability and ultra-low latency communications requirements. The problem is formulated as a po

External link: http://arxiv.org/abs/2303.05680

Report

Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames

Authors: Liang, Chengdong; Zhang, Xiao-Lei; Zhang, BinBin; Wu, Di; Li, Shengqiang; Song, Xingchen; Peng, Zhendong; Pan, Fuping

Recently, the unified streaming and non-streaming two-pass (U2/U2++) end-to-end model for speech recognition has shown great performance in terms of streaming capability, accuracy and latency. In this paper, we present fast-U2++, an enhanced version

External link: http://arxiv.org/abs/2211.00941

Report

TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty

Authors: Song, Xingchen; Wu, Di; Wu, Zhiyong; Zhang, Binbin; Zhang, Yuekai; Peng, Zhendong; Li, Wenpeng; Pan, Fuping; Zhu, Changbao

In this paper, we present TrimTail, a simple but effective emission regularization method to improve the latency of streaming ASR models. The core idea of TrimTail is to apply length penalty (i.e., by trimming trailing frames, see Fig. 1-(b)) directl

External link: http://arxiv.org/abs/2211.00522

Report

FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition

Authors: Song, Xingchen; Wu, Di; Zhang, Binbin; Wu, Zhiyong; Li, Wenpeng; Li, Dongfang; Zhang, Pengshen; Peng, Zhendong; Pan, Fuping; Zhu, Changbao; Wu, Zhongqin

The recently proposed Conformer architecture, which combines convolution with attention to capture both local and global dependencies, has become the de facto backbone model for Automatic Speech Recognition (ASR). Inherited from the Natural La

External link: http://arxiv.org/abs/2210.17079
