Showing 1 - 10
of 9,484
for search: '"An, Zheng-Hua"'
In this work, we present BiSSL, a first-of-its-kind training framework that introduces bilevel optimization to enhance the alignment between the pretext pre-training and downstream fine-tuning stages in self-supervised learning. BiSSL formulates the…
External link:
http://arxiv.org/abs/2410.02387
Author:
Kühne, Nikolai L., Kitchen, Astrid H. F., Jensen, Marie S., Brøndt, Mikkel S. L., Gonzalez, Martin, Biscio, Christophe, Tan, Zheng-Hua
Automatic speech recognition (ASR) systems are known to be vulnerable to adversarial attacks. This paper addresses detection and defence against targeted white-box attacks on speech signals for ASR systems. While existing work has utilised diffusion…
External link:
http://arxiv.org/abs/2409.07936
Speech signals encompass various kinds of information across multiple levels, including content, speaker, and style. Disentangling this information, although challenging, is important for applications such as voice conversion. The contrastive predictive…
External link:
http://arxiv.org/abs/2409.03520
While the transformer has emerged as the eminent neural architecture, several independent lines of research have emerged to address its limitations. Recurrent neural approaches have also seen a lot of renewed interest, including the extended long…
External link:
http://arxiv.org/abs/2408.16568
Author:
Shi, Jia-Hao, Qin, Zhi-Ying, Zhang, Jin-Peng, Cao, Jian, Jiang, Ze-Fang, Zhang, Wen-Chao, Zheng, Hua
A non-extensive (3+1)-dimensional hydrodynamic model for multi-particle production processes, NEX-CLVisc, is developed in the framework of CLVisc with the viscous corrections turned off. It assumes that the non-extensive effects consistently exist…
External link:
http://arxiv.org/abs/2408.12405
Author:
Cao, Jian, She, Zhi-Lei, Zhang, Jin-Peng, Shi, Jia-Hao, Qin, Zhi-Ying, Zhang, Wen-Chao, Zheng, Hua, Lei, An-Ke, Zhou, Dai-Mei, Yan, Yu-Liang, Sa, Ben-Hao
Published in:
Physical Review D 110, 054046 (2024)
Inspired by BESIII's newest observation of the glueball-like particle X(2370), we search for its production in both $e^+e^-$ collisions at $\sqrt{s}=$ 4.95 GeV and proton-proton (pp) collisions at $\sqrt{s}=$ 13 TeV with a parton and hadron cascade model…
External link:
http://arxiv.org/abs/2408.04130
Author:
She, Zhi-Lei, Lei, An-Ke, Zhang, Wen-Chao, Yan, Yu-Liang, Zhou, Dai-Mei, Zheng, Hua, Sa, Ben-Hao
The parton and hadron cascade model {\footnotesize PACIAE} is employed to confirm BESIII's newest observation of the production of the glueball-like particle $\rm X(2370)$ in $e^+e^-$ collisions at $\sqrt{s}=4.95\,\mathrm{GeV}$. We coalesce the $\rm X(2370)$…
External link:
http://arxiv.org/abs/2407.07661
Author:
Zhang, Yiming, Xu, Xuenan, Du, Ruoyi, Liu, Haohe, Dong, Yuan, Tan, Zheng-Hua, Wang, Wenwu, Ma, Zhanyu
In traditional audio captioning methods, a model is usually trained in a fully supervised manner using a human-annotated dataset containing audio-text pairs and then evaluated on the test sets from the same dataset. Such methods have two limitations.
External link:
http://arxiv.org/abs/2406.06295
The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems
Author:
Gonzalez, Philippe, Tan, Zheng-Hua, Østergaard, Jan, Jensen, Jesper, Alstrøm, Tommy Sonne, May, Tobias
The performance of deep neural network-based speech enhancement systems typically increases with the training dataset size. However, studies that investigated the effect of training dataset size on speech enhancement performance did not consider recent…
External link:
http://arxiv.org/abs/2406.06160
Author:
Yadav, Sarthak, Tan, Zheng-Hua
Despite its widespread adoption as the prominent neural architecture, the Transformer has spurred several independent lines of work to address its limitations. One such approach is selective state space models, which have demonstrated promising results…
External link:
http://arxiv.org/abs/2406.02178