Showing 1 - 10 of 18,820 results for the search: '"TAN, M"'
A new analysis of a sample of visual light curves of Mira variables is presented. The curves cover the past four decades and are selected from the AAVSO database as including a very large number of high-density and high-quality observations. The aim…
External link:
http://arxiv.org/abs/2411.18044
Author:
Nielsen, Stefan K., Nguyen, Tan M.
Contrastive learning has proven instrumental in learning unbiased representations of data, especially in complex environments characterized by high-cardinality and high-dimensional sensitive information. However, existing approaches within this setting… (a generic contrastive-loss sketch follows this entry)
External link:
http://arxiv.org/abs/2411.14765
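For orientation only, here is a minimal NumPy sketch of a generic InfoNCE-style contrastive loss, the basic mechanism the snippet appeals to. It is not the fairness-aware method of the linked paper; the batch size, embedding dimension, temperature, and function name are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE contrastive loss (illustrative only): each anchor is
    pulled toward its own positive and pushed away from every other sample."""
    # Cosine similarities between L2-normalized embeddings.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature                  # (batch, batch) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # matching pairs sit on the diagonal

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
print(info_nce_loss(x, x + 0.01 * rng.normal(size=(8, 16))))
```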
Author:
Teo, Rachel S. Y., Nguyen, Tan M.
Sparse Mixture of Experts (SMoE) has become the key to unlocking unparalleled scalability in deep learning. SMoE has the potential to exponentially increase parameter count while maintaining the efficiency of the model by only activating a small subset… (a top-k routing sketch follows this entry)
External link:
http://arxiv.org/abs/2410.14574
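The sketch below illustrates the routing idea described in the snippet: a Sparse Mixture of Experts sends each token to only its top-k experts, so parameters grow with the expert count while per-token compute stays small. This is a generic toy version in NumPy, not the method of the linked paper; the gate, expert shapes, and k are assumptions.

```python
import numpy as np

def smoe_layer(x, gate_w, expert_ws, k=2):
    """Toy Sparse Mixture of Experts: each token is routed to its top-k experts
    and the expert outputs are mixed with renormalized gate weights."""
    scores = x @ gate_w                                # (tokens, n_experts) router logits
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = np.argsort(scores[t])[-k:]            # indices of the top-k experts
        w = np.exp(scores[t, chosen])
        w /= w.sum()                                   # softmax over the chosen experts only
        for weight, e in zip(w, chosen):
            out[t] += weight * np.tanh(x[t] @ expert_ws[e])  # tiny one-layer "expert"
    return out

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=(5, d))
y = smoe_layer(x, rng.normal(size=(d, n_experts)), rng.normal(size=(n_experts, d, d)))
print(y.shape)  # (5, 8): same shape as the input tokens
```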
Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representations. Previous NFN designs often…
External link:
http://arxiv.org/abs/2409.11697
Author:
Nguyen, Tan M., Nguyen, Tam, Ho, Nhat, Bertozzi, Andrea L., Baraniuk, Richard G., Osher, Stanley J.
Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics…
External link:
http://arxiv.org/abs/2406.13781
Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-the-art performance across a variety of applications in language and vision. This dot-product self-attention computes attention weights among the input tokens… (a minimal single-head attention sketch follows this entry)
External link:
http://arxiv.org/abs/2406.13770
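Since the snippet describes how dot-product self-attention computes attention weights among the input tokens, the standard single-head formulation softmax(QK^T / sqrt(d)) V may be a useful reference. The sketch is the textbook construction, not anything specific to the linked paper, and all sizes are arbitrary.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head dot-product self-attention: weights are the softmax of
    scaled query-key dot products; the output mixes the value vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (tokens, tokens) pairwise dot products
    A = softmax(scores, axis=-1)              # each row is a distribution over tokens
    return A @ V

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                   # 5 tokens of dimension 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (5, 8)
```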
Author:
Teo, Rachel S. Y., Nguyen, Tan M.
The remarkable success of transformers in sequence modeling tasks, spanning various applications in natural language processing and computer vision, is attributed to the critical role of self-attention. Similar to the development of most deep learning…
External link:
http://arxiv.org/abs/2406.13762
Sliced Wasserstein (SW) distance in Optimal Transport (OT) is widely used in various applications thanks to its statistical effectiveness and computational efficiency. On the other hand, Tree Wasserstein (TW) and Tree-sliced Wasserstein (TSW) are… (a Monte-Carlo SW sketch follows this entry)
External link:
http://arxiv.org/abs/2406.13725
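As background for the snippet, the sketch below computes a plain Monte-Carlo Sliced Wasserstein distance between two equally sized point clouds: project onto random directions, solve each 1-D problem by sorting, and average. This is the standard SW estimator, not the tree-sliced construction of the linked paper; the sample sizes and number of projections are assumptions.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=0):
    """Monte-Carlo Sliced Wasserstein distance between two equally sized
    point clouds: 1-D optimal transport reduces to sorting the projections."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_projections, X.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # random unit directions
    total = 0.0
    for theta in dirs:
        xs, ys = np.sort(X @ theta), np.sort(Y @ theta)   # sorted 1-D projections
        total += np.mean(np.abs(xs - ys) ** p)
    return (total / n_projections) ** (1.0 / p)

rng = np.random.default_rng(1)
A = rng.normal(size=(64, 3))
B = rng.normal(loc=2.0, size=(64, 3))      # same law, shifted by (2, 2, 2)
print(sliced_wasserstein(A, B))            # positive distance driven mostly by the shift
```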
In this work, we address two main shortcomings of transformer architectures: input corruption and rank collapse in their output representation. We unveil self-attention as an autonomous state-space model that inherently promotes smoothness in its solutions…
External link:
http://arxiv.org/abs/2402.15989
Transformers have achieved remarkable success in a wide range of natural language processing and computer vision applications. However, the representation capacity of a deep transformer model is degraded due to the over-smoothing issue in which the token representations… (a small numerical demonstration follows this entry)
External link:
http://arxiv.org/abs/2312.00751
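To make the over-smoothing claim in the snippet concrete, here is a tiny self-contained demonstration, not the paper's method: repeatedly applying a fixed row-stochastic "attention" matrix drives all token representations toward a common vector, so the spread across tokens shrinks with depth. The matrix, sizes, and depth are arbitrary choices.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                   # 6 token representations of dimension 4
A = softmax(rng.normal(size=(6, 6)))          # a fixed row-stochastic "attention" matrix

for layer in range(1, 9):
    X = A @ X                                 # attention-style averaging at each layer
    spread = np.linalg.norm(X - X.mean(axis=0), axis=1).mean()
    print(f"layer {layer}: mean token distance from the centroid = {spread:.3e}")
```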