Search results

Academic article

Application Potential of Social Media Data Analytics in Typhoon Disaster Management: Taking the Impact of Typhoon Doksuri on Fujian Province as an Example

Authors: Xie Xuemiao, Shao Yiwen

Published in: Redai dili, Vol 44, Iss 6, Pp 1090-1101 (2024)

The rapid growth of social media has introduced new concepts and technical approaches for disaster management. This paper reviews the characteristics of social media data and its application potential in disaster management research, providing a new

External link: https://doaj.org/article/beba5546393b4b69833c248593410dae

Report

Advancing Multi-talker ASR Performance with Large Language Models

Authors: Shi, Mohan, Jin, Zengrui, Xu, Yaoxun, Xu, Yong, Zhang, Shi-Xiong, Wei, Kun, Shao, Yiwen, Zhang, Chunlei, Yu, Dong

Recognizing overlapping speech from multiple speakers in conversational scenarios is one of the most challenging problems for automatic speech recognition (ASR). Serialized output training (SOT) is a classic method to address multi-talker ASR, with th

External link: http://arxiv.org/abs/2408.17431

Report

Co-benefits of Agricultural Diversification and Technology for Food and Nutrition Security in China

Authors: Wanger, Thomas Cherico, Raveloaritiana, Estelle, Zeng, Siyan, Gao, Haixiu, He, Xueqing, Shao, Yiwen, Wu, Panlong, Wyckhuys, Kris A. G., Zhou, Wenwu, Zou, Yi, Zhu, Zengrong, Li, Ling, Cen, Haiyan, Liu, Yunhui, Fan, Shenggen

China is the leading crop producer and has successfully implemented sustainable development programs related to agriculture. Sustainable agriculture has been promoted to achieve national food security targets such as food self-sufficiency through the

External link: http://arxiv.org/abs/2407.01364

Report

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment

Authors: Shao, Yiwen, Zhang, Shi-Xiong, Xu, Yong, Yu, Meng, Yu, Dong, Povey, Daniel, Khudanpur, Sanjeev

In the field of multi-channel, multi-speaker Automatic Speech Recognition (ASR), the task of discerning and accurately transcribing a target speaker's speech within background noise remains a formidable challenge. Traditional approaches often rely on

External link: http://arxiv.org/abs/2406.09589

Report

RIR-SF: Room Impulse Response Based Spatial Feature for Target Speech Recognition in Multi-Channel Multi-Speaker Scenarios

Authors: Shao, Yiwen, Zhang, Shi-Xiong, Yu, Dong

Automatic speech recognition (ASR) on multi-talker recordings is challenging. Current methods using 3D spatial data from multi-channel audio and visual cues focus mainly on direct waves from the target speaker, overlooking reflection wave impacts, wh

External link: http://arxiv.org/abs/2311.00146

Report

UniX-Encoder: A Universal $X$-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing

Authors: Huang, Zili, Shao, Yiwen, Zhang, Shi-Xiong, Yu, Dong

The speech field is evolving to solve more challenging scenarios, such as multi-channel recordings with multiple simultaneous talkers. Given the many types of microphone setups out there, we present the UniX-Encoder. It's a universal encoder designed

External link: http://arxiv.org/abs/2310.16367

Report

Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset

Author: Shao, Yiwen

Multi-channel multi-talker speech recognition presents formidable challenges in the realm of speech processing, marked by issues such as background noise, reverberation, and overlapping speech. Overcoming these complexities requires leveraging contex

External link: http://arxiv.org/abs/2310.03901

Report

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser

Authors: Joshi, Sonal, Kataria, Saurabh, Shao, Yiwen, Zelasko, Piotr, Villalba, Jesus, Khudanpur, Sanjeev, Dehak, Najim

Adversarial attacks are a threat to automatic speech recognition (ASR) systems, and it becomes imperative to propose defenses to protect them. In this paper, we perform experiments to show that K2 conformer hybrid ASR is strongly affected by white-bo

External link: http://arxiv.org/abs/2204.03851

Report

Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature

Authors: Shao, Yiwen, Zhang, Shi-Xiong, Yu, Dong

Automatic speech recognition (ASR) of multi-channel multi-speaker overlapped speech remains one of the most challenging tasks for the speech community. In this paper, we look into this challenge by utilizing the location information of target speakers

External link: http://arxiv.org/abs/2111.11023

Report

Adversarial Attacks and Defenses for Speech Recognition Systems

Authors: Żelasko, Piotr, Joshi, Sonal, Shao, Yiwen, Villalba, Jesus, Trmal, Jan, Dehak, Najim, Khudanpur, Sanjeev

The ubiquitous presence of machine learning systems in our lives necessitates research into their vulnerabilities and appropriate countermeasures. In particular, we investigate the effectiveness of adversarial attacks and defenses against automatic s

External link: http://arxiv.org/abs/2103.17122
