Showing 1 - 10 of 19 results for the search: '"Feng, Kaituo"'
Author:
Gong, Kaixiong, Feng, Kaituo, Li, Bohao, Wang, Yibing, Cheng, Mofan, Yang, Shijia, Han, Jiaming, Wang, Benyou, Bai, Yutong, Yang, Zhuoran, Yue, Xiangyu
Recently, multimodal large language models (MLLMs), such as GPT-4o, Gemini 1.5 Pro, and Reka Core, have expanded their capabilities to include vision and audio modalities. While these models demonstrate impressive performance across a wide range of…
External link:
http://arxiv.org/abs/2412.02611
Low-rank training has emerged as a promising approach for reducing memory usage in training Large Language Models (LLMs). Previous methods either rely on decomposing weight matrices (e.g., LoRA) or seek to decompose gradient matrices (e.g., GaLore)…
External link:
http://arxiv.org/abs/2410.01623
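For quick orientation, here is a minimal sketch of the weight-matrix decomposition this entry mentions, written as a LoRA-style adapter in PyTorch; the class name, rank, and scaling factor are illustrative assumptions rather than details from the paper, and GaLore-style gradient decomposition is not shown.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen pretrained weight W plus a trainable low-rank update B @ A (LoRA-style sketch).
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)                        # pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # low-rank factor A
        self.B = nn.Parameter(torch.zeros(out_features, rank))        # low-rank factor B, zero-init
        self.scale = alpha / rank

    def forward(self, x):
        # y = x W^T + scale * x A^T B^T; only A and B receive gradients during training
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())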
Chain-of-thought distillation is a powerful technique for transferring reasoning abilities from large language models (LLMs) to smaller student models. Previous methods typically require the student to mimic the step-by-step rationale produced by LLMs…
External link:
http://arxiv.org/abs/2405.16064
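As a rough illustration of the rationale-mimicking setup this entry refers to, here is a hedged sketch of sequence-level distillation on teacher-written rationales; the batch field names and the tokenizer/model interface (Hugging Face-style) are assumptions, not details from the paper.

import torch

def cot_distillation_step(student, tokenizer, batch):
    # Generic rationale-mimicking loss: the student is trained with next-token
    # cross-entropy on rationales previously written by a teacher LLM.
    # `batch["prompt"]` and `batch["rationale"]` are assumed field names.
    text = [p + r for p, r in zip(batch["prompt"], batch["rationale"])]
    enc = tokenizer(text, return_tensors="pt", padding=True)
    labels = enc["input_ids"].clone()
    labels[enc["attention_mask"] == 0] = -100     # ignore padding positions in the loss
    out = student(input_ids=enc["input_ids"],
                  attention_mask=enc["attention_mask"],
                  labels=labels)
    return out.loss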
End-to-end motion planning models equipped with deep neural networks have shown great potential for enabling full autonomous driving. However, the oversized neural networks render them impractical for deployment on resource-constrained systems, which…
External link:
http://arxiv.org/abs/2403.01238
Typical Convolutional Neural Networks (ConvNets) depend heavily on large amounts of image data and resort to an iterative optimization algorithm (e.g., SGD or Adam) to learn network parameters, which makes training very time- and resource-intensive.
External link:
http://arxiv.org/abs/2310.11862
Knowledge distillation (KD) has been shown to be effective in boosting the performance of graph neural networks (GNNs), where the typical objective is to distill knowledge from a deeper teacher GNN into a shallower student GNN. However, it is often quite challenging…
External link:
http://arxiv.org/abs/2307.00534
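To make the teacher-to-student objective in this entry concrete, here is a generic knowledge-distillation loss sketch (soft teacher targets plus hard labels) applied to per-node classification logits; the temperature and weighting are illustrative assumptions, not values from the paper.

import torch.nn.functional as F

def gnn_kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-label matching (Hinton-style KD) plus the usual supervised term,
    # computed on node classification logits from teacher and student GNNs.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard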
Graph neural networks (GNNs) for temporal graphs have recently attracted increasing attention, where a common assumption is that the class set for nodes is closed. However, in real-world scenarios it often faces the open-set problem with the dynamic…
External link:
http://arxiv.org/abs/2303.15015
Graph-structured data often possess dynamic characteristics in nature. Recent years have witnessed increasing attention paid to dynamic graph neural networks for modelling graph data. However, almost all existing approaches operate under the assumption…
External link:
http://arxiv.org/abs/2207.10839
Knowledge distillation (KD) has demonstrated its effectiveness in boosting the performance of graph neural networks (GNNs), where the goal is to distill knowledge from a deeper teacher GNN into a shallower student GNN. However, it is actually difficult…
External link:
http://arxiv.org/abs/2206.06561