Search results - "ZHANG Wei-qiang"

Report

CoopASD: Cooperative Machine Anomalous Sound Detection with Privacy Concerns

Authors: Jiang, Anbai; Shi, Yuchen; Fan, Pingyi; Zhang, Wei-Qiang; Liu, Jia

Machine anomalous sound detection (ASD) has emerged as one of the most promising applications in the Industrial Internet of Things (IIoT) due to its unprecedented efficacy in mitigating risks of malfunctions and promoting production efficiency. Previ

External link: http://arxiv.org/abs/2408.14753

Report

Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text

Authors: Li, Jinpeng; Pu, Yu; Sun, Qi; Zhang, Wei-Qiang

Whisper and other large-scale automatic speech recognition models have made significant progress in performance. However, their performance on many low-resource languages, such as Kazakh, is not satisfactory. It is worth researching how to utilize lo

External link: http://arxiv.org/abs/2408.05554

Report

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

Authors: Jiang, Anbai; Han, Bing; Lv, Zhiqiang; Deng, Yufeng; Zhang, Wei-Qiang; Chen, Xie; Qian, Yanmin; Liu, Jia; Fan, Pingyi

Large pre-trained models have demonstrated dominant performances in multiple areas, where the consistency between pre-training and fine-tuning is the key to success. However, few works reported satisfactory results of pre-trained models for the machi

External link: http://arxiv.org/abs/2406.11364

Report

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

Authors: Yang, Yifan; Song, Zheshu; Zhuo, Jianheng; Cui, Mingyu; Li, Jinpeng; Yang, Bo; Du, Yexing; Ma, Ziyang; Liu, Xunying; Wang, Ziyuan; Li, Ke; Fan, Shuai; Yu, Kai; Zhang, Wei-Qiang; Chen, Guoguo; Chen, Xie

The evolution of speech technology has been spurred by the rapid increase in dataset sizes. Traditional speech models generally depend on a large amount of labeled training data, which is scarce for low-resource languages. This paper presents GigaSpe

External link: http://arxiv.org/abs/2406.11546

Report

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

Authors: Wang, Haoyu; Hu, Guoqiang; Lin, Guodong; Zhang, Wei-Qiang; Li, Jian

As a robust and large-scale multilingual speech recognition model, Whisper has demonstrated impressive results in many low-resource and out-of-distribution scenarios. However, its encoder-decoder structure hinders its application to streaming speech

External link: http://arxiv.org/abs/2406.10052

Report

SpeechColab Leaderboard: An Open-Source Platform for Automatic Speech Recognition Evaluation

Authors: Du, Jiayu; Li, Jinpeng; Chen, Guoguo; Zhang, Wei-Qiang

In the wake of the surging tide of deep learning over the past decade, Automatic Speech Recognition (ASR) has garnered substantial attention, leading to the emergence of numerous publicly accessible ASR systems that are actively being integrated into

External link: http://arxiv.org/abs/2403.08196

Report

Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection

Authors: Cui, Ziyun; Wu, Wen; Zhang, Wei-Qiang; Wu, Ji; Zhang, Chao

The detection of Alzheimer's disease (AD) from spontaneous speech has attracted increasing attention while the sparsity of training data remains an important issue. This paper handles the issue by knowledge transfer, specifically from both speech-gen

External link: http://arxiv.org/abs/2310.04358

Report

Task-Agnostic Structured Pruning of Speech Representation Models

Authors: Wang, Haoyu; Wang, Siyuan; Zhang, Wei-Qiang; Suo, Hongbin; Wan, Yulong

Self-supervised pre-trained models such as Wav2vec2, Hubert, and WavLM have been shown to significantly improve many speech tasks. However, their large memory and strong computational requirements hinder their industrial applicability. Structured pru

External link: http://arxiv.org/abs/2306.01385

Report

DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model

Authors: Wang, Haoyu; Wang, Siyuan; Zhang, Wei-Qiang; Bai, Jinfeng

Multilingual self-supervised speech representation models have greatly enhanced the speech recognition performance for low-resource languages, and the compression of these huge models has also become a crucial prerequisite for their industrial applic

External link: http://arxiv.org/abs/2306.01303

Report

Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning

Authors: Zhang, Hao; Si, Nianwen; Chen, Yaqi; Zhang, Wenlin; Yang, Xukui; Qu, Dan; Zhang, Wei-Qiang

Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, 2023

The end-to-end speech translation (E2E-ST) model has gradually become a mainstream paradigm due to its low latency and less error propagation. However, it is non-trivial to train such a model well due to the task complexity and data scarcity. The spe

External link: http://arxiv.org/abs/2304.10309
