Showing 1 - 10 of 373 for the search: '"Wang, Yisen"'
Contrastive learning has been a leading paradigm for self-supervised learning, but it is widely observed that it comes at the price of sacrificing useful features (e.g., colors) by being invariant to data augmentations. Given this limitation, there has …
External link:
http://arxiv.org/abs/2411.06508
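A minimal sketch of the augmentation-invariance objective the abstract above refers to: a standard InfoNCE-style contrastive loss, not this paper's proposed method; all names and hyperparameters are illustrative.

```python
# Minimal InfoNCE-style contrastive loss: two augmented "views" of the same
# image are pulled together; all other samples in the batch are pushed apart.
# Because the loss only rewards agreement across augmentations, any feature
# destroyed by an augmentation (e.g., color under color jitter) carries no
# training signal -- the invariance/feature-loss trade-off the abstract notes.
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two augmented views of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature      # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))    # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```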
Enhancing node-level Out-Of-Distribution (OOD) generalization on graphs remains a crucial area of research. In this paper, we develop a Structural Causal Model (SCM) to theoretically dissect the performance of two prominent invariant learning methods …
External link:
http://arxiv.org/abs/2411.02847
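For readers unfamiliar with the setup, a structural causal model in this context specifies how features are generated from the label across environments. A generic toy example with an invariant/spurious feature split follows; it is not the SCM constructed in the paper, and all variable names are illustrative.

```python
# Toy SCM: the label y causes an invariant feature with a stable mechanism,
# and a spurious feature whose correlation with y depends on the environment,
# so a model relying on the spurious feature fails under distribution shift.
import numpy as np

def sample_environment(n: int, spurious_corr: float, rng: np.random.Generator):
    y = rng.integers(0, 2, size=n)                 # y ~ Bernoulli(0.5)
    inv = y + 0.1 * rng.standard_normal(n)         # invariant: stable P(inv | y)
    flip = rng.random(n) > spurious_corr           # environment-specific noise
    spu = np.where(flip, 1 - y, y) + 0.1 * rng.standard_normal(n)
    return np.stack([inv, spu], axis=1), y

rng = np.random.default_rng(0)
X_train, y_train = sample_environment(1000, spurious_corr=0.9, rng=rng)  # spurious helps
X_test, y_test = sample_environment(1000, spurious_corr=0.1, rng=rng)    # spurious hurts
```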
Author:
Fang, Lizhe, Wang, Yifei, Liu, Zhaoyang, Zhang, Chenheng, Jegelka, Stefanie, Gao, Jinyang, Ding, Bolin, Wang, Yisen
Handling long-context inputs is crucial for large language models (LLMs) in tasks such as extended conversations, document summarization, and many-shot in-context learning. While recent approaches have extended the context windows of LLMs and employed …
External link:
http://arxiv.org/abs/2410.23771
Deep learning models often suffer from a lack of interpretability due to polysemanticity, where individual neurons are activated by multiple unrelated semantics, resulting in unclear attributions of model behavior. Recent advances in monosemanticity, …
External link:
http://arxiv.org/abs/2410.21331
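One widely used route to monosemantic features (not necessarily the approach taken in this paper) is a sparse autoencoder trained on hidden activations; a minimal sketch, with all dimensions illustrative:

```python
# Sparse autoencoder: an overcomplete dictionary with an L1 penalty encourages
# each learned feature to fire for one narrow concept, decomposing
# polysemantic neurons into (more) monosemantic directions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 512, d_dict: int = 4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_dict)
        self.dec = nn.Linear(d_dict, d_model)

    def forward(self, h: torch.Tensor):
        f = torch.relu(self.enc(h))   # sparse feature activations
        return self.dec(f), f

sae = SparseAutoencoder()
h = torch.randn(32, 512)              # a batch of hidden activations
recon, feats = sae(h)
loss = ((recon - h) ** 2).mean() + 1e-3 * feats.abs().mean()  # reconstruction + L1 sparsity
```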
In this work, we explore the mechanism of in-context learning (ICL) on out-of-distribution (OOD) tasks that were not encountered during training. To achieve this, we conduct synthetic experiments where the objective is to learn OOD mathematical functions …
External link:
http://arxiv.org/abs/2410.09695
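Synthetic ICL setups of the kind the abstract describes typically serialize (x, f(x)) pairs into a prompt and score the model on a held-out query point. A generic sketch follows; the function class and prompt format are illustrative, not the paper's exact protocol.

```python
# In-context learning probe: build a prompt from (x, f(x)) demonstrations drawn
# from a function class unseen in training (here, quadratics as the "OOD"
# class), then compare the model's completion against the held-out target.
import numpy as np

def make_icl_prompt(f, n_examples: int = 8, seed: int = 0):
    rng = np.random.default_rng(seed)
    xs = rng.uniform(-1, 1, size=n_examples + 1)
    demos = "\n".join(f"x={x:.3f} -> y={f(x):.3f}" for x in xs[:-1])
    query, target = xs[-1], f(xs[-1])
    return f"{demos}\nx={query:.3f} -> y=", target

prompt, target = make_icl_prompt(lambda x: 2 * x**2 - 1)  # OOD: quadratic
print(prompt)  # feed to the LLM; score its completion against `target`
```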
This paper studies the vulnerabilities of transformer-based Large Language Models (LLMs) to jailbreaking attacks, focusing specifically on the optimization-based Greedy Coordinate Gradient (GCG) strategy. We first observe a positive correlation between …
External link:
http://arxiv.org/abs/2410.09040
Skip connections are an essential ingredient for making modern deep models deeper and more powerful. Despite their huge success in normal scenarios (state-of-the-art classification performance on natural examples), we investigate and identify an interesting …
External link:
http://arxiv.org/abs/2410.08950
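For reference, the skip connection in question is the identity bypass of a residual block; a minimal sketch (illustrative, not this paper's model):

```python
# Residual block: the output is x + F(x). The identity path lets gradients flow
# through very deep stacks, which is what makes such models trainable; the
# abstract above investigates a side effect of this same path.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)   # skip connection: identity plus residual

y = ResidualBlock()(torch.randn(4, 64))
```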
q-Breathers (QBs) represent a quintessential phenomenon of energy localization, manifesting as stable periodic orbits exponentially localized in normal mode space. Their existence can hinder the thermalization process in nonlinear lattices. In this …
External link:
http://arxiv.org/abs/2410.06575
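The normal-mode energies in which q-breathers are localized can be computed directly for a 1D chain with fixed ends. A minimal sketch using the standard FPUT-style definitions (unit masses and couplings; not code from the paper):

```python
# Normal-mode energies E_q of a 1D chain with fixed boundaries:
#   Q_q = sqrt(2/(N+1)) * sum_n x_n sin(pi q n / (N+1)),  same for P_q,
#   omega_q = 2 sin(pi q / (2(N+1))),  E_q = (P_q^2 + omega_q^2 Q_q^2) / 2.
# A q-breather shows up as E_q exponentially localized around one seed mode.
import numpy as np

def mode_energies(x: np.ndarray, p: np.ndarray) -> np.ndarray:
    """x, p: positions and momenta of the N particles."""
    N = len(x)
    n = np.arange(1, N + 1)
    q = np.arange(1, N + 1)
    S = np.sqrt(2 / (N + 1)) * np.sin(np.pi * np.outer(q, n) / (N + 1))
    Q, P = S @ x, S @ p                      # normal-mode coordinates
    omega = 2 * np.sin(np.pi * q / (2 * (N + 1)))
    return 0.5 * (P**2 + omega**2 * Q**2)

# Excite mode q=1 only: all energy sits in E[0].
E = mode_energies(np.sin(np.pi * np.arange(1, 33) / 33), np.zeros(32))
```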
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains thanks to spline activation functions, becoming an alternative to Multi-Layer Perceptrons (MLPs). However, spline functions may not respect symmetry in tasks, which is crucial …
External link:
http://arxiv.org/abs/2410.00435
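The spline activations the abstract contrasts with MLPs are learnable 1D functions placed on network edges. A minimal sketch follows, using Gaussian bumps on a fixed grid as a simple stand-in for the B-spline basis KANs actually use; it is illustrative, not the paper's implementation.

```python
# KAN-style edge activation: instead of a fixed nonlinearity, each edge carries
# a learnable 1D function phi(x) = sum_k c_k * b_k(x), where the basis b_k is
# fixed and only the coefficients c_k are trained.
import torch
import torch.nn as nn

class EdgeSpline(nn.Module):
    def __init__(self, n_basis: int = 8, lo: float = -1.0, hi: float = 1.0):
        super().__init__()
        self.centers = torch.linspace(lo, hi, n_basis)   # fixed grid of bump centers
        self.width = (hi - lo) / n_basis
        self.coef = nn.Parameter(torch.zeros(n_basis))   # learnable coefficients

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        return basis @ self.coef                         # learnable 1D function of x

phi = EdgeSpline()
y = phi(torch.linspace(-1, 1, 5))   # phi starts as the zero function and is shaped by training
```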
Published in:
International Conference on Machine Learning 2024
Diffusion models have achieved notable success in image generation, but they remain highly vulnerable to backdoor attacks, which compromise their integrity by producing specific undesirable outputs when presented with a pre-defined trigger. In this paper, …
External link:
http://arxiv.org/abs/2409.05294