Showing 1 - 10 of 30 for search: '"Sanyuan Chen"'
Recently, masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition. It usually requires a codebook obtained in an unsupervised way, making it less accurate and difficult to interpret. We pro
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::192c40cae2ab823b0cc9f6b8c73edce5
http://arxiv.org/abs/2206.10125
Published in:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Transformer has been successfully applied to speech separation recently with its strong long-dependency modeling capacity using a self-attention mechanism. However, Transformer tends to have heavy run-time costs due to the deep encoder layers, which
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3e9aca068857e3b65193d13d87acb164
http://arxiv.org/abs/2204.12777
Author:
Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Peidong Wang, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei
Recently, self-supervised learning (SSL) has demonstrated strong performance in speaker recognition, even if the pre-training objective is designed for speech recognition. In this paper, we study which factor leads to the success of self-supervised l
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::15a8deceaf95727a6f7dac27b11fd5ce
Published in:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
Author:
Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Michael Zeng, Xiangzhan Yu, Furu Wei
Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks. As speech signal contains multi-faceted information including speaker identity, paralinguisti
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0a96fba8faf1ed11e34d9988e7381319
http://arxiv.org/abs/2110.13900
Author:
Naoyuki Kanda, Shujie Liu, Takuya Yoshioka, Sanyuan Chen, Zhuo Chen, Jian Wu, Jinyu Li, Yu Wu
Speech separation has been successfully applied as a frontend processing module of conversation transcription systems thanks to its ability to handle overlapped speech and its flexibility to combine with downstream tasks such as automatic speech reco
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::462ebfd4f97ec2b76582fd4d179c031b
http://arxiv.org/abs/2107.01922
Published in:
ICASSP
With its strong modeling capacity that comes from a multi-head and multi-layer structure, Transformer is a very powerful model for learning a sequential representation and has been successfully applied to speech separation recently. However, multi-ch
Author:
Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu
Self-supervised learning (SSL) is a long-standing goal for speech processing, since it utilizes large-scale unlabeled data and avoids extensive human labeling. Recent years witness great successes in applying self-supervised learning in speech recogn
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c42fa6da70bfb98dc2f1e219a7d43ca9
Author:
Yong Zhao, Gang Liu, Shujie Liu, Jinyu Li, Tianyan Zhou, Zhuo Chen, Jian Wu, Naoyuki Kanda, Sanyuan Chen, Yifan Gong, Yu Wu, Takuya Yoshioka, Xiong Xiao
Published in:
ICASSP
This paper describes the Microsoft speaker diarization system for monaural multi-talker recordings in the wild, evaluated at the diarization track of the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2020. We will first explain our system design to
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b5604602be3e363abd4f168f34d7f7ce
http://arxiv.org/abs/2010.11458