Showing 1 - 10 of 175 for search: '"BAI Yushi"'
Author:
Zhang, Jiajie, Bai, Yushi, Lv, Xin, Gu, Wanjun, Liu, Danqing, Zou, Minhao, Cao, Shulin, Hou, Lei, Dong, Yuxiao, Feng, Ling, Li, Juanzi
Though current long-context large language models (LLMs) have demonstrated impressive capacities in answering user questions based on extensive text, the lack of citations in their responses makes user verification difficult, leading to concerns about…
External link:
http://arxiv.org/abs/2409.02897
Author:
Bai, Yushi, Zhang, Jiajie, Lv, Xin, Zheng, Linzhi, Zhu, Siqi, Hou, Lei, Dong, Yuxiao, Tang, Jie, Li, Juanzi
Current long context large language models (LLMs) can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding even a modest length of 2,000 words. Through controlled experiments, we find that the model's effective generation length…
External link:
http://arxiv.org/abs/2408.07055
Large language models (LLMs) excel in various capabilities but also pose safety risks such as generating harmful content and misinformation, even after safety alignment. In this paper, we explore the inner mechanisms of safety alignment from the perspective…
External link:
http://arxiv.org/abs/2406.14144
Author:
GLM, Team, Zeng, Aohan, Xu, Bin, Wang, Bowen, Zhang, Chenhui, Yin, Da, Zhang, Dan, Rojas, Diego, Feng, Guanyu, Zhao, Hanlin, Lai, Hanyu, Yu, Hao, Wang, Hongning, Sun, Jiadai, Zhang, Jiajie, Cheng, Jiale, Gui, Jiayi, Tang, Jie, Zhang, Jing, Sun, Jingyu, Li, Juanzi, Zhao, Lei, Wu, Lindong, Zhong, Lucen, Liu, Mingdao, Huang, Minlie, Zhang, Peng, Zheng, Qinkai, Lu, Rui, Duan, Shuaiqi, Zhang, Shudan, Cao, Shulin, Yang, Shuxun, Tam, Weng Lam, Zhao, Wenyi, Liu, Xiao, Xia, Xiao, Zhang, Xiaohan, Gu, Xiaotao, Lv, Xin, Liu, Xinghan, Liu, Xinyi, Yang, Xinyue, Song, Xixuan, Zhang, Xunkai, An, Yifan, Xu, Yifan, Niu, Yilin, Yang, Yuantao, Li, Yueyan, Bai, Yushi, Dong, Yuxiao, Qi, Zehan, Wang, Zhaoyu, Yang, Zhen, Du, Zhengxiao, Hou, Zhenyu, Wang, Zihan
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models…
External link:
http://arxiv.org/abs/2406.12793
The advancement of large language models (LLMs) relies on evaluation using public benchmarks, but data contamination can lead to overestimated performance. Previous research focuses on detecting contamination by determining whether the model has seen…
External link:
http://arxiv.org/abs/2406.04197
To advance the evaluation of multimodal math reasoning in large multimodal models (LMMs), this paper introduces a novel benchmark, MM-MATH. MM-MATH consists of 5,929 open-ended middle school math problems with visual contexts, with fine-grained classification…
External link:
http://arxiv.org/abs/2404.05091
Author:
Ying, Jiahao, Cao, Yixin, Bai, Yushi, Sun, Qianru, Wang, Bo, Tang, Wei, Ding, Zhaojun, Yang, Yizhe, Huang, Xuanjing, Yan, Shuicheng
Large language models (LLMs) have achieved impressive performance across various natural language benchmarks, prompting a continual need to curate more difficult datasets for larger LLMs, which is costly and time-consuming. In this paper, we propose…
External link:
http://arxiv.org/abs/2402.11894
Author:
Qi, Ji, Ding, Ming, Wang, Weihan, Bai, Yushi, Lv, Qingsong, Hong, Wenyi, Xu, Bin, Hou, Lei, Li, Juanzi, Dong, Yuxiao, Tang, Jie
Vision-Language Models (VLMs) have demonstrated their broad effectiveness thanks to extensive training in aligning visual instructions to responses. However, such training of conclusive alignment leads models to ignore essential visual reasoning, further…
External link:
http://arxiv.org/abs/2402.04236
Author:
Bai, Yushi, Lv, Xin, Zhang, Jiajie, He, Yuze, Qi, Ji, Hou, Lei, Tang, Jie, Dong, Yuxiao, Li, Juanzi
Extending large language models to effectively handle long contexts requires instruction fine-tuning on input sequences of similar length. To address this, we present LongAlign -- a recipe of the instruction data, training, and evaluation for long context…
External link:
http://arxiv.org/abs/2401.18058
Author:
He, Yuze, Bai, Yushi, Lin, Matthieu, Sheng, Jenny, Hu, Yubin, Wang, Qi, Wen, Yu-Hui, Liu, Yong-Jin
By lifting the pre-trained 2D diffusion models into Neural Radiance Fields (NeRFs), text-to-3D generation methods have made great progress. Many state-of-the-art approaches usually apply score distillation sampling (SDS) to optimize the NeRF representation…
External link:
http://arxiv.org/abs/2312.11774