Showing 1 - 10 of 135 for search: '"Wu, Yongtao"'
Adversarial attacks in Natural Language Processing apply perturbations at the character or token level. Token-level attacks, gaining prominence for their use of gradient-based methods, are susceptible to altering sentence semantics, leading to inval
External link:
http://arxiv.org/abs/2405.04346
Recent developments in neural architecture search (NAS) emphasize the significance of considering robust architectures against malicious data. However, there is a notable absence of benchmark evaluations and theoretical guarantees for searching these
External link:
http://arxiv.org/abs/2403.13134
We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle's noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particul
External link:
http://arxiv.org/abs/2402.03210
In this paper, we aim to build the global convergence theory of encoder-only shallow Transformers under a realistic setting from the perspective of architectures, initialization, and scaling under a finite width regime. The difficulty lies in how to
External link:
http://arxiv.org/abs/2311.01575
Author:
Wu, Yongtao
Polynomial neural networks (NNs-Hp) have recently demonstrated high expressivity and efficiency across several tasks. However, a theoretical explanation toward such success is still unclear, especially when compared to the classical neural networks.
External link:
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-318819
Neural tangent kernel (NTK) is a powerful tool to analyze training dynamics of neural networks and their generalization bounds. The study on NTK has been devoted to typical neural network architectures, but it is incomplete for neural networks with H
External link:
http://arxiv.org/abs/2209.07736
Published in:
Journal of Orthopaedic Surgery & Research. 6/18/2024, Vol. 19 Issue 1, p1-13. 13p.
Time-frequency (TF) representations in audio synthesis have been increasingly modeled with real-valued networks. However, overlooking the complex-valued nature of TF representations can result in suboptimal performance and require additional modules
External link:
http://arxiv.org/abs/2206.06811
Published in:
Journal of Surgical Research, June 2024, 298:63-70
Author:
Liu, Zhen1,2 (AUTHOR), Wu, Yongtao1,2 (AUTHOR), Liao, Jin1,2 (AUTHOR), Li, Dexian1,2 (AUTHOR), Zhou, Cuiying1,2 (AUTHOR) zhoucy@mail.sysu.edu.cn
Published in:
PLoS ONE. 4/3/2024, Vol. 19 Issue 4, p1-27. 27p.