Showing 1 - 10 of 4,103 for search: '"Zhang, Wentao"'
Large language models (LLMs) have demonstrated exceptional performance across a wide range of tasks and domains, with data preparation playing a critical role in achieving these results. Pre-training data typically combines information from multiple
External link:
http://arxiv.org/abs/2409.17527
Large Language Models (LLMs) have exhibited exceptional performance across a broad range of tasks and domains. However, they still encounter difficulties in solving mathematical problems due to the rigorous and logical nature of mathematics. Previous
External link:
http://arxiv.org/abs/2409.17972
Author:
Huang, Qiang, Yan, Xiao, Wang, Xin, Rao, Susie Xi, Han, Zhichao, Fu, Fangcheng, Zhang, Wentao, Jiang, Jiawei
Temporal graph neural networks (TGNNs) outperform regular GNNs by incorporating time information into graph-based operations. However, TGNNs adopt specialized models (e.g., TGN, TGAT, and APAN) and require tailored training frameworks (e.g., TGL and
External link:
http://arxiv.org/abs/2409.05477
Author:
Lu, Keer, Liang, Zheng, Nie, Xiaonan, Pan, Da, Zhang, Shusen, Zhao, Keshi, Chen, Weipeng, Zhou, Zenan, Dong, Guosheng, Zhang, Wentao, Cui, Bin
The effectiveness of long-context modeling is important for Large Language Models (LLMs) in various applications. Despite their potential, LLMs' efficacy in processing long context does not consistently meet expectations, posing significant challenge
External link:
http://arxiv.org/abs/2409.00997
Author:
Li, Xunkai, Zhu, Yinlin, Pang, Boyang, Yan, Guochen, Yan, Yeyu, Li, Zening, Wu, Zhengyu, Zhang, Wentao, Li, Rong-Hua, Wang, Guoren
Federated graph learning (FGL) has emerged as a promising distributed training paradigm for graph neural networks across multiple local systems without direct data sharing. This approach is particularly beneficial in privacy-sensitive scenarios and o
External link:
http://arxiv.org/abs/2408.16288
Author:
Dong, Guosheng, Pan, Da, Sun, Yiding, Zhang, Shusen, Liang, Zheng, Wu, Xin, Shen, Yanjun, Yang, Fan, Sun, Haoze, Li, Tianpeng, Lin, Mingan, Xu, Jianhua, Zhang, Yufan, Nie, Xiaonan, Su, Lei, Wang, Bingning, Zhang, Wentao, Mao, Jiaxin, Zhou, Zenan, Chen, Weipeng
The general capabilities of Large Language Models (LLMs) rely heavily on the composition and selection of extensive pretraining datasets, treated as commercial secrets by several institutions. To mitigate this issue, we open-source the details of a uni
External link:
http://arxiv.org/abs/2408.15079
Author:
An, Wei, Bi, Xiao, Chen, Guanting, Chen, Shanhuang, Deng, Chengqi, Ding, Honghui, Dong, Kai, Du, Qiushi, Gao, Wenjun, Guan, Kang, Guo, Jianzhong, Guo, Yongqiang, Fu, Zhe, He, Ying, Huang, Panpan, Li, Jiashi, Liang, Wenfeng, Liu, Xiaodong, Liu, Xin, Liu, Yiyuan, Liu, Yuxuan, Lu, Shanghao, Lu, Xuan, Nie, Xiaotao, Pei, Tian, Qiu, Junjie, Qu, Hui, Ren, Zehui, Sha, Zhangli, Su, Xuecheng, Sun, Xiaowen, Tan, Yixuan, Tang, Minghui, Wang, Shiyu, Wang, Yaohui, Wang, Yongji, Xie, Ziwei, Xiong, Yiliang, Xu, Yanhong, Ye, Shengfeng, Yu, Shuiping, Zha, Yukun, Zhang, Liyue, Zhang, Haowei, Zhang, Mingchuan, Zhang, Wentao, Zhang, Yichao, Zhao, Chenggang, Zhao, Yao, Zhou, Shangyan, Zhou, Shunfeng, Zou, Yuheng
The rapid progress in Deep Learning (DL) and Large Language Models (LLMs) has exponentially increased demands for computational power and bandwidth. This, combined with the high costs of faster computing chips and interconnects, has significantly infl
External link:
http://arxiv.org/abs/2408.14158
With the development of the modern social economy, tourism has become an important way to meet people's spiritual needs, bringing development opportunities to the tourism industry. However, existing large language models (LLMs) face challenges in per
External link:
http://arxiv.org/abs/2408.12003
Author:
Yin, Yuanyang, Zhao, Yaqi, Zhang, Yajie, Lin, Ke, Wang, Jiahao, Tao, Xin, Wan, Pengfei, Zhang, Di, Yin, Baoqun, Zhang, Wentao
Multimodal Large Language Models (MLLMs) have recently demonstrated remarkable perceptual and reasoning abilities, typically comprising a Vision Encoder, an Adapter, and a Large Language Model (LLM). The adapter serves as the critical bridge between
External link:
http://arxiv.org/abs/2408.11813
Author:
Qin, Yanzhao, Zhang, Tao, Shen, Yanjun, Luo, Wenjing, Sun, Haoze, Zhang, Yan, Qiao, Yujing, Chen, Weipeng, Zhou, Zenan, Zhang, Wentao, Cui, Bin
Large Language Models (LLMs) have become instrumental across various applications, with the customization of these models to specific scenarios becoming increasingly critical. The system message, a fundamental component of LLMs, consists of carefully c
External link:
http://arxiv.org/abs/2408.10943