Showing 1 - 10 of 54 for search: '"Fan, Xinjie"'
The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains. Most such models use multi-head self-attention, which is appealing for the ability to attend to information from…
External link:
http://arxiv.org/abs/2110.12567
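To ground the term, here is a minimal sketch of standard multi-head self-attention in PyTorch; it is a generic illustration of the mechanism the abstract refers to, not this paper's model, and all shapes and weight names are invented for the example:

```python
import torch
import torch.nn.functional as F

def multi_head_self_attention(x, w_qkv, w_out, num_heads):
    """Standard scaled dot-product self-attention with several heads.

    x: (batch, seq_len, d_model); w_qkv: (d_model, 3 * d_model);
    w_out: (d_model, d_model). Illustrative only.
    """
    batch, seq_len, d_model = x.shape
    head_dim = d_model // num_heads
    # Project inputs to queries, keys, and values in one matmul.
    q, k, v = (x @ w_qkv).chunk(3, dim=-1)
    # Split the model dimension into independent heads.
    def split(t):
        return t.view(batch, seq_len, num_heads, head_dim).transpose(1, 2)
    q, k, v = split(q), split(k), split(v)
    # Each head attends to information from every position.
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
    attn = F.softmax(scores, dim=-1)          # (batch, heads, seq, seq)
    out = attn @ v
    # Merge the heads back and apply the output projection.
    out = out.transpose(1, 2).reshape(batch, seq_len, d_model)
    return out @ w_out

x = torch.randn(2, 5, 16)
out = multi_head_self_attention(x, torch.randn(16, 48), torch.randn(16, 16), num_heads=4)
print(out.shape)  # torch.Size([2, 5, 16])
```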
Authors:
Tanwisuth, Korawat, Fan, Xinjie, Zheng, Huangjie, Zhang, Shujian, Zhang, Hao, Chen, Bo, Zhou, Mingyuan
Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space. To avoid the sampling variability, class imbalance, and data-privacy concerns that often…
External link:
http://arxiv.org/abs/2110.12024
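For context, a common instance of such a statistical distance is the maximum mean discrepancy (MMD); the sketch below computes a squared RBF-kernel MMD between source and target feature batches. This illustrates the baseline approach the abstract contrasts against, not the paper's own method, and the bandwidth and shapes are arbitrary:

```python
import torch

def rbf_mmd(source, target, bandwidth=1.0):
    """Squared MMD between two feature batches under an RBF kernel."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances -> Gaussian kernel values.
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2 * bandwidth ** 2))
    k_ss = kernel(source, source).mean()
    k_tt = kernel(target, target).mean()
    k_st = kernel(source, target).mean()
    return k_ss + k_tt - 2 * k_st

src = torch.randn(64, 32)          # source latent features
tgt = torch.randn(64, 32) + 0.5    # shifted target latent features
print(rbf_mmd(src, tgt))           # grows as the two domains drift apart
```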
Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks. Most such models use deterministic attention, while stochastic attention is less explored due to optimization difficulties or complicated model design…
External link:
http://arxiv.org/abs/2106.05251
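One generic way to make attention stochastic is to sample the weights via the Gumbel-softmax reparameterization rather than a plain softmax. The sketch below illustrates that idea under assumed shapes; it is not the construction proposed in the paper:

```python
import torch
import torch.nn.functional as F

def stochastic_attention(q, k, v, temperature=0.5):
    """Attention whose weights are sampled rather than deterministic.

    Uses the Gumbel-softmax trick so the sampled weights stay differentiable.
    """
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    # Perturb the logits with Gumbel noise; each forward pass then
    # yields a different attention pattern over the values.
    u = torch.rand_like(scores).clamp_min(1e-9)
    gumbel = -torch.log(-torch.log(u))
    weights = F.softmax((scores + gumbel) / temperature, dim=-1)
    return weights @ v, weights

q = torch.randn(2, 5, 8)
out1, w1 = stochastic_attention(q, q, q)
out2, w2 = stochastic_attention(q, q, q)
print(torch.allclose(w1, w2))  # False: attention is now a random variable
```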
Single domain generalization aims to learn a model that performs well on many unseen domains using only one domain's data for training. Existing works focus on studying adversarial domain augmentation (ADA) to improve the model's generalization capability…
External link:
http://arxiv.org/abs/2106.01899
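As a rough illustration of adversarial domain augmentation, the sketch below creates fictitious training samples by stepping the inputs along the gradient that increases the loss (an FGSM-style step; methods in this line are more elaborate, and the model and step size here are placeholders):

```python
import torch
import torch.nn.functional as F

def adversarial_augment(model, x, y, step_size=0.1):
    """Create 'fictitious domain' samples via an adversarial input step."""
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move the inputs in the direction that increases the loss, simulating
    # a distribution shift the single source domain does not cover.
    with torch.no_grad():
        x_adv = x_adv + step_size * x_adv.grad.sign()
    return x_adv.detach()

model = torch.nn.Linear(16, 3)
x, y = torch.randn(8, 16), torch.randint(0, 3, (8,))
x_aug = adversarial_augment(model, x, y)
# Training on both x and x_aug is the augmentation step of ADA-style methods.
```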
Published in:
ICLR 2021
Dropout has been demonstrated as a simple and effective module that not only regularizes the training process of deep neural networks, but also provides uncertainty estimates for prediction. However, the quality of uncertainty estimation is highly dependent…
External link:
http://arxiv.org/abs/2103.04181
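Dropout-based uncertainty estimation usually means Monte Carlo dropout: keep dropout active at test time and aggregate several stochastic forward passes. A minimal sketch of that standard recipe follows, with an assumed toy model and a fixed dropout rate; it is not the contextual variant this paper proposes:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),             # dropout rate is fixed by hand here
    torch.nn.Linear(64, 1),
)

def mc_dropout_predict(model, x, n_samples=50):
    """Mean prediction and predictive std from stochastic forward passes."""
    model.train()  # keep dropout active at test time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)

x = torch.randn(4, 16)
mean, std = mc_dropout_predict(model, x)
print(mean.shape, std.shape)  # std serves as a (crude) uncertainty estimate
```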
Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability. Most current models use deterministic attention modules due to…
External link:
http://arxiv.org/abs/2010.10604
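The interpretability of attention modules typically comes from reading the attention weights as importance scores. The sketch below extracts those weights from PyTorch's stock deterministic attention layer; this is generic library usage, not the stochastic module studied in the paper:

```python
import torch

attn = torch.nn.MultiheadAttention(embed_dim=16, num_heads=4, batch_first=True)
x = torch.randn(1, 6, 16)  # one sequence of 6 tokens

# need_weights=True also returns the attention map (averaged over heads).
out, weights = attn(x, x, x, need_weights=True)
print(weights.shape)          # (1, 6, 6): how much token i attends to token j
print(weights[0].argmax(-1))  # for each token, its most-attended position
```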
Models based on the Transformer architecture have achieved better accuracy than those based on competing architectures across a large set of tasks. A unique feature of the Transformer is its universal application of a self-attention mechanism, which…
External link:
http://arxiv.org/abs/2009.14308
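For reference, the Transformer's universal use of self-attention can be seen in PyTorch's stock encoder, where every layer lets every position attend to every other position; the sizes below are arbitrary:

```python
import torch

# A stack of self-attention + feed-forward blocks, as in the Transformer.
layer = torch.nn.TransformerEncoderLayer(
    d_model=32, nhead=4, dim_feedforward=64, batch_first=True)
encoder = torch.nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(2, 10, 32)   # (batch, seq_len, d_model)
out = encoder(x)             # self-attention is applied at every layer
print(out.shape)             # torch.Size([2, 10, 32])
```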
Sequence generation models are commonly refined with reinforcement learning over user-defined metrics. However, high gradient variance hinders the practical use of this method. To stabilize this method, we adapt to contextual generation of categorical…
External link:
http://arxiv.org/abs/1912.13151
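The method being stabilized here is the REINFORCE-style policy gradient, whose variance is classically reduced with a baseline. The toy sketch below shows that standard estimator on a categorical sequence with a made-up reward; the paper proposes a different, lower-variance estimator:

```python
import torch

vocab, seq_len = 10, 5
logits = torch.zeros(seq_len, vocab, requires_grad=True)  # toy policy

def reward(seq):
    # Stand-in for a user-defined metric (e.g., BLEU in text generation).
    return (seq == 3).float().sum()

# REINFORCE with a moving-average baseline to reduce gradient variance.
baseline = 0.0
opt = torch.optim.SGD([logits], lr=0.1)
for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    seq = dist.sample()                  # sample a categorical sequence
    r = reward(seq)
    baseline = 0.9 * baseline + 0.1 * r.item()
    loss = -(r - baseline) * dist.log_prob(seq).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(logits.argmax(-1))  # the policy should now favor the rewarded token 3
```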
Selecting hyperparameters for unsupervised learning problems is challenging in general due to the lack of ground truth for validation. Despite the prevalence of this issue in statistics and machine learning, especially in clustering problems, there are…
External link:
http://arxiv.org/abs/1910.08018
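A standard workaround when no ground truth is available is an internal validity index; the sketch below picks the number of clusters by silhouette score, one common heuristic rather than the method proposed in the paper:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Toy data with three well-separated groups.
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in (0, 3, 6)])

# Score each candidate k without any ground-truth labels.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

print(max(scores, key=scores.get))  # expected: 3
```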
Authors:
Huang, Fengjuan, Fan, Xinjie, Wang, Ying, Zou, Yu, Lian, Jiangfang, Wang, Chuang (wangchuang@nbu.edu.cn), Ding, Feng, Sun, Yunxiang
Published in:
Briefings in Bioinformatics, Mar 2024, Vol. 25, Issue 2, p. 1-15.