Showing 1 - 10 of 83 for search: '"Shao Yiwen"'
Author:
Xie Xuemiao, Shao Yiwen
Published in:
Redai dili, Vol 44, Iss 6, Pp 1090-1101 (2024)
The rapid growth of social media has introduced new concepts and technical approaches for disaster management. This paper reviews the characteristics of social media data and its application potential in disaster management research, providing a new
External link:
https://doaj.org/article/beba5546393b4b69833c248593410dae
Author:
Shi, Mohan, Jin, Zengrui, Xu, Yaoxun, Xu, Yong, Zhang, Shi-Xiong, Wei, Kun, Shao, Yiwen, Zhang, Chunlei, Yu, Dong
Recognizing overlapping speech from multiple speakers in conversational scenarios is one of the most challenging problems for automatic speech recognition (ASR). Serialized output training (SOT) is a classic method to address multi-talker ASR, with th
External link:
http://arxiv.org/abs/2408.17431
Author:
Wanger, Thomas Cherico, Raveloaritiana, Estelle, Zeng, Siyan, Gao, Haixiu, He, Xueqing, Shao, Yiwen, Wu, Panlong, Wyckhuys, Kris A. G., Zhou, Wenwu, Zou, Yi, Zhu, Zengrong, Li, Ling, Cen, Haiyan, Liu, Yunhui, Fan, Shenggen
China is the leading crop producer and has successfully implemented sustainable development programs related to agriculture. Sustainable agriculture has been promoted to achieve national food security targets such as food self-sufficiency through the
External link:
http://arxiv.org/abs/2407.01364
Author:
Shao, Yiwen, Zhang, Shi-Xiong, Xu, Yong, Yu, Meng, Yu, Dong, Povey, Daniel, Khudanpur, Sanjeev
In the field of multi-channel, multi-speaker Automatic Speech Recognition (ASR), the task of discerning and accurately transcribing a target speaker's speech within background noise remains a formidable challenge. Traditional approaches often rely on
External link:
http://arxiv.org/abs/2406.09589
Automatic speech recognition (ASR) on multi-talker recordings is challenging. Current methods using 3D spatial data from multi-channel audio and visual cues focus mainly on direct waves from the target speaker, overlooking reflection wave impacts, wh
External link:
http://arxiv.org/abs/2311.00146
The speech field is evolving to solve more challenging scenarios, such as multi-channel recordings with multiple simultaneous talkers. Given the many types of microphone setups out there, we present the UniX-Encoder. It's a universal encoder designed
External link:
http://arxiv.org/abs/2310.16367
Author:
Shao, Yiwen
Multi-channel multi-talker speech recognition presents formidable challenges in the realm of speech processing, marked by issues such as background noise, reverberation, and overlapping speech. Overcoming these complexities requires leveraging contex
External link:
http://arxiv.org/abs/2310.03901
Author:
Joshi, Sonal, Kataria, Saurabh, Shao, Yiwen, Zelasko, Piotr, Villalba, Jesus, Khudanpur, Sanjeev, Dehak, Najim
Adversarial attacks are a threat to automatic speech recognition (ASR) systems, and it becomes imperative to propose defenses to protect them. In this paper, we perform experiments to show that K2 conformer hybrid ASR is strongly affected by white-bo
External link:
http://arxiv.org/abs/2204.03851
Automatic speech recognition (ASR) of multi-channel multi-speaker overlapped speech remains one of the most challenging tasks to the speech community. In this paper, we look into this challenge by utilizing the location information of target speakers
External link:
http://arxiv.org/abs/2111.11023
Author:
Żelasko, Piotr, Joshi, Sonal, Shao, Yiwen, Villalba, Jesus, Trmal, Jan, Dehak, Najim, Khudanpur, Sanjeev
The ubiquitous presence of machine learning systems in our lives necessitates research into their vulnerabilities and appropriate countermeasures. In particular, we investigate the effectiveness of adversarial attacks and defenses against automatic s
External link:
http://arxiv.org/abs/2103.17122