Search results - "Zheng, Thomas"

Report

Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models

Authors: Zhao, Yiyang, Wang, Shuai, Sun, Guangzhi, Chen, Zehua, Zhang, Chao, Xu, Mingxing, Zheng, Thomas Fang

In this paper, Whisper, a large-scale pre-trained model for automatic speech recognition, is proposed to apply to speaker verification. A partial multi-scale feature aggregation (PMFA) approach is proposed based on a subset of Whisper encoder blocks

External link: http://arxiv.org/abs/2408.15585

Report

A Joint Noise Disentanglement and Adversarial Training Framework for Robust Speaker Verification

Authors: Xing, Xujiang, Xu, Mingxing, Zheng, Thomas Fang

Published in: Interspeech 2024

Automatic Speaker Verification (ASV) suffers from performance degradation in noisy conditions. To address this issue, we propose a novel adversarial learning framework that incorporates noise-disentanglement to establish a noise-independent speaker i

External link: http://arxiv.org/abs/2408.11562

Report

Speaker Adaptation for Quantised End-to-End ASR Models

Authors: Zhao, Qiuming, Sun, Guangzhi, Zhang, Chao, Xu, Mingxing, Zheng, Thomas Fang

End-to-end models have shown superior performance for automatic speech recognition (ASR). However, such models are often very large in size and thus challenging to deploy on resource-constrained edge devices. While quantisation can reduce model sizes

External link: http://arxiv.org/abs/2408.03979

Report

SAML: Speaker Adaptive Mixture of LoRA Experts for End-to-End ASR

Authors: Zhao, Qiuming, Sun, Guangzhi, Zhang, Chao, Xu, Mingxing, Zheng, Thomas Fang

Mixture-of-experts (MoE) models have achieved excellent results in many tasks. However, conventional MoE models are often very large, making them challenging to deploy on resource-constrained edge devices. In this paper, we propose a novel speaker ad

External link: http://arxiv.org/abs/2406.19706

E-book

Author: Zheng, Thomas Fang

External link: KNAV e-book collection

Report

Enhancing Quantised End-to-End ASR Models via Personalisation

Authors: Zhao, Qiuming, Sun, Guangzhi, Zhang, Chao, Xu, Mingxing, Zheng, Thomas Fang

Recent end-to-end automatic speech recognition (ASR) models have become increasingly larger, making them particularly challenging to be deployed on resource-constrained devices. Model quantisation is an effective solution that sometimes causes the wo

External link: http://arxiv.org/abs/2309.09136

Academic article

A random mutagenesis screen enriched for missense mutations in bacterial effector proteins.

Authors: Urbanus, Malene L¹ (AUTHOR), Zheng, Thomas M¹ (AUTHOR), Khusnutdinova, Anna N²,³ (AUTHOR), Banh, Doreen¹ (AUTHOR), Mount, Harley O'Connor⁴ (AUTHOR), Gupta, Alind⁴ (AUTHOR), Stogios, Peter J² (AUTHOR), Savchenko, Alexei²,⁵ (AUTHOR), Isberg, Ralph R⁶ (AUTHOR), Yakunin, Alexander F²,³ (AUTHOR), Ensminger, Alexander W¹,⁴ (AUTHOR) alex.ensminger@utoronto.ca

Published in: G3: Genes | Genomes | Genetics. Sep 2024, Vol. 14, Issue 9, pp. 1-12 (12 pp.).

Report

How Speech is Recognized to Be Emotional - A Study Based on Information Decomposition

Authors: Sun, Haoran, Li, Lantian, Zheng, Thomas Fang, Wang, Dong

The way that humans encode their emotion into speech signals is complex. For instance, an angry man may increase his pitch and speaking rate, and use impolite words. In this paper, we present a preliminary study on various emotional factors and inves

External link: http://arxiv.org/abs/2111.12324

Report

A Multi-Resolution Front-End for End-to-End Speech Anti-Spoofing

Authors: Liu, Wei, Sun, Meng, Zhang, Xiongwei, Van hamme, Hugo, Zheng, Thomas Fang

The choice of an optimal time-frequency resolution is usually a difficult but important step in tasks involving speech signal classification, e.g., speech anti-spoofing. The variations of the performance with different choices of time-frequency resolu

External link: http://arxiv.org/abs/2110.05087

Report

Attack on practical speaker verification system using universal adversarial perturbations

Authors: Zhang, Weiyi, Zhao, Shuning, Liu, Le, Li, Jianmin, Cheng, Xingliang, Zheng, Thomas Fang, Hu, Xiaolin

In authentication scenarios, applications of practical speaker verification systems usually require a person to read a dynamic authentication text. Previous studies played an audio adversarial example as a digital signal to perform physical attacks,

External link: http://arxiv.org/abs/2105.09022
