Showing 1 - 10
of 369
for search: '"Oh, Jaehoon"'
Massive activations, which manifest in specific feature dimensions of hidden states, introduce a significant bias in large language models (LLMs), leading to an overemphasis on the corresponding token. In this paper, we identify that massive activati…
External link:
http://arxiv.org/abs/2410.01866
Author:
Lee, Gihun, Jeong, Minchan, Kim, Yujin, Jung, Hojung, Oh, Jaehoon, Kim, Sangmook, Yun, Se-Young
While learning to align Large Language Models (LLMs) with human preferences has shown remarkable success, aligning these models to meet the diverse user preferences presents further challenges in preserving previous knowledge. This paper examines the…
External link:
http://arxiv.org/abs/2407.00693
Federated Learning (FL) is a collaborative method for training models while preserving data privacy in decentralized settings. However, FL encounters challenges related to data heterogeneity, which can result in performance degradation. In our study,…
External link:
http://arxiv.org/abs/2311.13267
Contrastive language-image pre-training (CLIP) has demonstrated remarkable zero-shot classification ability, namely image classification using novel text labels. Existing works have attempted to enhance CLIP by fine-tuning on downstream tasks, but th…
External link:
http://arxiv.org/abs/2308.15273
Federated Learning (FL) aggregates locally trained models from individual clients to construct a global model. While FL enables learning a model with data privacy, it often suffers from significant performance degradation when clients have heterogene…
External link:
http://arxiv.org/abs/2308.12532
Translation has played a crucial role in improving the performance on multilingual tasks: (1) to generate the target language data from the source language data for training and (2) to generate the source language data from the target language data f…
External link:
http://arxiv.org/abs/2210.09588
Author:
Oh, Jaehoon, Yun, Se-Young
Few-shot class-incremental learning (FSCIL) has addressed challenging real-world scenarios where unseen novel classes continually arrive with few samples. In these scenarios, it is required to develop a model that recognizes the novel classes without…
External link:
http://arxiv.org/abs/2206.10596
Most of the recent few-shot learning (FSL) algorithms are based on transfer learning, where a model is pre-trained using a large amount of source data, and the pre-trained model is fine-tuned using a small amount of target data. In transfer learning-…
External link:
http://arxiv.org/abs/2205.07874
Cross-domain few-shot learning (CD-FSL), where there are few target samples under extreme differences between source and target domains, has recently attracted huge attention. Recent studies on CD-FSL generally focus on transfer learning based approa…
External link:
http://arxiv.org/abs/2205.05282
Cross-domain few-shot learning (CD-FSL) has drawn increasing attention for handling large differences between the source and target domains--an important concern in real-world scenarios. To overcome these large differences, recent works have consider…
External link:
http://arxiv.org/abs/2202.01339