Search results

Report

MECG-E: Mamba-based ECG Enhancer for Baseline Wander Removal

Authors: Hung, Kuo-Hsuan; Wang, Kuan-Chen; Liu, Kai-Chun; Chen, Wei-Lun; Lu, Xugang; Tsao, Yu; Lin, Chii-Wann

Electrocardiogram (ECG) is an important non-invasive method for diagnosing cardiovascular disease. However, ECG signals are susceptible to noise contamination, such as electrical interference or signal wandering, which reduces diagnostic accuracy. Va

External link: http://arxiv.org/abs/2409.18828

Report

Channel Adaptation for Speaker Verification Using Optimal Transport with Pseudo Label

Authors: Yang, Wenhao; Wei, Jianguo; Lu, Wenhuan; Li, Lei; Lu, Xugang

Domain gap often degrades the performance of speaker verification (SV) systems when the statistical distributions of training data and real-world test speech are mismatched. Channel variation, a primary factor causing this gap, is less addressed than

External link: http://arxiv.org/abs/2409.09396

Report

Integrated Multi-Level Knowledge Distillation for Enhanced Speaker Verification

Authors: Yang, Wenhao; Wei, Jianguo; Lu, Wenhuan; Lu, Xugang; Li, Lei

Knowledge distillation (KD) is widely used in audio tasks, such as speaker verification (SV), by transferring knowledge from a well-trained large model (the teacher) to a smaller, more compact model (the student) for efficiency and portability. Exist

External link: http://arxiv.org/abs/2409.09389

Report

Temporal Order Preserved Optimal Transport-based Cross-modal Knowledge Transfer Learning for ASR

Authors: Lu, Xugang; Shen, Peng; Tsao, Yu; Kawai, Hisashi

Transferring linguistic knowledge from a pretrained language model (PLM) to an acoustic model has been shown to greatly improve the performance of automatic speech recognition (ASR). However, due to the heterogeneous feature distributions in cross-mo

External link: http://arxiv.org/abs/2409.02239

Report

Robust Channel Learning for Large-Scale Radio Speaker Verification

Authors: Yang, Wenhao; Wei, Jianguo; Lu, Wenhuan; Li, Lei; Lu, Xugang

Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to i

External link: http://arxiv.org/abs/2406.10956

Report

A Non-Intrusive Neural Quality Assessment Model for Surface Electromyography Signals

Authors: Lee, Cho-Yuan; Wang, Kuan-Chen; Liu, Kai-Chun; Wang, Yu-Te; Lu, Xugang; Yeh, Ping-Cheng; Tsao, Yu

In practical scenarios involving the measurement of surface electromyography (sEMG) in muscles, particularly those areas near the heart, one of the primary sources of contamination is the presence of electrocardiogram (ECG) signals. To assess the qua

External link: http://arxiv.org/abs/2402.05482

Report

Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions

Authors: Liu, Yang; Sun, Haoqin; Chen, Geng; Wang, Qingyue; Zhao, Zhen; Lu, Xugang; Wang, Longbiao

Speech emotion recognition (SER) performance deteriorates significantly in the presence of noise, making it challenging to achieve competitive performance in noisy conditions. To this end, we propose a multi-level knowledge distillation (MLKD) method

External link: http://arxiv.org/abs/2312.13556

Report

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Authors: Shen, Peng; Lu, Xugang; Kawai, Hisashi

Multi-talker overlapped speech recognition remains a significant challenge, requiring not only speech recognition but also speaker diarization tasks to be addressed. In this paper, to better address these tasks, we first introduce speaker labels into

External link: http://arxiv.org/abs/2312.10959

Report

Neural domain alignment for spoken language recognition based on optimal transport

Authors: Lu, Xugang; Shen, Peng; Tsao, Yu; Kawai, Hisashi

Domain shift poses a significant challenge in cross-domain spoken language recognition (SLR) by reducing its effectiveness. Unsupervised domain adaptation (UDA) algorithms have been explored to address domain shifts in SLR without relying on class la

External link: http://arxiv.org/abs/2310.13471

Report

Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR

Authors: Lu, Xugang; Shen, Peng; Tsao, Yu; Kawai, Hisashi

Due to the modality discrepancy between textual and acoustic modeling, efficiently transferring linguistic knowledge from a pretrained language model (PLM) to acoustic encoding for automatic speech recognition (ASR) still remains a challenging task.

External link: http://arxiv.org/abs/2309.16093
