Showing 1 - 10 of 1,068 results for search: '"Wang, Tianhao"'
Numerous approaches have been recently proposed for learning fair representations that mitigate unfair outcomes in prediction tasks. A key motivation for these methods is that the representations can be used by third parties with unknown objectives.
External link:
http://arxiv.org/abs/2406.16698
Deploying a well-optimized pre-trained speaker recognition model in a new domain often leads to a significant decline in performance. While fine-tuning is a commonly employed solution, it demands ample adaptation data and suffers from parameter ineff…
External link:
http://arxiv.org/abs/2406.07832
We prove that for any generating set $S$ of $\Gamma=\mathbb{Z}^n$, the continuous edge chromatic number of the Schreier graph of the Bernoulli shift action $G=F(S,2^\Gamma)$ is $\chi'_c(G)=\chi'(G)+1$. In particular, for the standard generating set, …
External link:
http://arxiv.org/abs/2406.02825
The fusion of raw features from multiple sensors on an autonomous vehicle to create a Bird's Eye View (BEV) representation is crucial for planning and control systems. There is growing interest in using deep learning models for BEV semantic segmentat…
External link:
http://arxiv.org/abs/2405.20986
Reinforcement learning (RL) trains an agent from experiences interacting with the environment. In scenarios where online interactions are impractical, offline RL, which trains the agent using pre-collected datasets, has become popular. While this new…
External link:
http://arxiv.org/abs/2404.12530
Soft errors in memories and logic circuits are known to disturb program execution. In this context, the research community has been proposing a plethora of fault-tolerance (FT) solutions over the last decades, as well as fault-injection (FI) approach…
External link:
http://arxiv.org/abs/2403.20319
The pre-training and fine-tuning paradigm has demonstrated its effectiveness and has become the standard approach for tailoring language models to various tasks. Currently, community-based platforms offer easy access to various pre-trained models, as…
External link:
http://arxiv.org/abs/2403.09562
We study gradient flow on the exponential loss for a classification problem with a one-layer softmax attention model, where the key and query weight matrices are trained separately. Under a separability assumption on the data, we show that when gradi…
External link:
http://arxiv.org/abs/2403.08699
Transformer-based models have demonstrated remarkable in-context learning capabilities, prompting extensive research into their underlying mechanisms. Recent studies have suggested that Transformers can implement first-order optimization algorithms for…
External link:
http://arxiv.org/abs/2403.03183
We study the dynamics of gradient flow for training a multi-head softmax attention model for in-context learning of multi-task linear regression. We establish the global convergence of gradient flow under suitable choices of initialization. In additi…
External link:
http://arxiv.org/abs/2402.19442