Showing 1 - 7 of 7 for search: '"Feng, Huawen"'
Unsupervised sentence embeddings task aims to convert sentences to semantic vector representations. Most previous works directly use the sentence representations derived from pretrained language models. However, due to the token bias in pretrained la
External link:
http://arxiv.org/abs/2402.15153
Class-Incremental Learning (CIL) is a practical and challenging problem for achieving general artificial intelligence. Recently, Pre-Trained Models (PTMs) have led to breakthroughs in both visual and natural language processing tasks. Despite recent
External link:
http://arxiv.org/abs/2402.10063
Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining. MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is f
External link:
http://arxiv.org/abs/2401.09181
Authors:
Feng, Huawen, Fan, Yan, Liu, Xiong, Lin, Ting-En, Yao, Zekun, Wu, Yuchuan, Huang, Fei, Li, Yongbin, Ma, Qianli
Despite the recent progress in text summarization made by large language models (LLMs), they often generate summaries that are factually inconsistent with original articles, known as "hallucinations" in text generation. Unlike previous small models (
External link:
http://arxiv.org/abs/2310.19347
Authors:
Zheng, Junhao, Ma, Qianli, Qiu, Shengjie, Wu, Yue, Ma, Peitian, Liu, Junlong, Feng, Huawen, Shang, Xichen, Chen, Haibin
Fine-tuning has been proven to be a simple and effective technique to transfer the learned knowledge of Pre-trained Language Models (PLMs) to downstream tasks. However, vanilla fine-tuning easily overfits the target data and degrades the generalizati
External link:
http://arxiv.org/abs/2306.10790
In text classification, the traditional attention mechanisms usually focus too much on frequent words, and need extensive labeled data in order to learn. This paper proposes a perturbation-based self-supervised attention approach to guide attention l
External link:
http://arxiv.org/abs/2305.15684
Published in:
Abstract & Applied Analysis; 2013, p1-4, 4p