Výsledky vyhledávání

Report

LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models

Autor: Gui, Jiayi, Liu, Yiming, Cheng, Jiale, Gu, Xiaotao, Liu, Xiao, Wang, Hongning, Dong, Yuxiao, Tang, Jie, Huang, Minlie

Large Language Models (LLMs) have demonstrated notable capabilities across various tasks, showcasing complex problem-solving abilities. Understanding and executing complex rules, along with multi-step planning, are fundamental to logical reasoning an

Externí odkaz: http://arxiv.org/abs/2408.15778

Zobrazit plný text záznamu

Report

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

Large Multimodal Models (LMMs) have ushered in a new era in artificial intelligence, merging capabilities in both language and vision to form highly capable Visual Foundation Agents. These agents are postulated to excel across a myriad of tasks, pote

Externí odkaz: http://arxiv.org/abs/2408.06327

Zobrazit plný text záznamu

Report

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Autor: Yang, Zhuoyi, Teng, Jiayan, Zheng, Wendi, Ding, Ming, Huang, Shiyu, Xu, Jiazheng, Yang, Yuanming, Hong, Wenyi, Zhang, Xiaohan, Feng, Guanyu, Yin, Da, Gu, Xiaotao, Zhang, Yuxuan, Wang, Weihan, Cheng, Yean, Liu, Ting, Xu, Bin, Dong, Yuxiao, Tang, Jie

We introduce CogVideoX, a large-scale diffusion transformer model designed for generating videos based on text prompts. To efficently model video data, we propose to levearge a 3D Variational Autoencoder (VAE) to compress videos along both spatial an

Externí odkaz: http://arxiv.org/abs/2408.06072

Zobrazit plný text záznamu

Report

Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Autor: Wen, Bosi, Ke, Pei, Gu, Xiaotao, Wu, Lindong, Huang, Hao, Zhou, Jinfeng, Li, Wenchuang, Hu, Binxin, Gao, Wendy, Xu, Jiaxin, Liu, Yiming, Tang, Jie, Wang, Hongning, Huang, Minlie

Instruction following is one of the fundamental capabilities of large language models (LLMs). As the ability of LLMs is constantly improving, they have been increasingly applied to deal with complex human instructions in real-world scenarios. Therefo

Externí odkaz: http://arxiv.org/abs/2407.03978

Zobrazit plný text záznamu

Report

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Autor: Cheng, Jiale, Lu, Yida, Gu, Xiaotao, Ke, Pei, Liu, Xiao, Dong, Yuxiao, Wang, Hongning, Tang, Jie, Huang, Minlie

Although Large Language Models (LLMs) are becoming increasingly powerful, they still exhibit significant but subtle weaknesses, such as mistakes in instruction-following or coding tasks. As these unexpected errors could lead to severe consequences in

Externí odkaz: http://arxiv.org/abs/2406.16714

Zobrazit plný text záznamu

Report

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable model

Externí odkaz: http://arxiv.org/abs/2406.12793

Zobrazit plný text záznamu

Report

NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts

Autor: Zhang, Shudan, Zhao, Hanlin, Liu, Xiao, Zheng, Qinkai, Qi, Zehan, Gu, Xiaotao, Zhang, Xiaohan, Dong, Yuxiao, Tang, Jie

Large language models (LLMs) have manifested strong ability to generate codes for productive activities. However, current benchmarks for code synthesis, such as HumanEval, MBPP, and DS-1000, are predominantly oriented towards introductory tasks on al

Externí odkaz: http://arxiv.org/abs/2405.04520

Zobrazit plný text záznamu

Report

CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion

Autor: Zheng, Wendi, Teng, Jiayan, Yang, Zhuoyi, Wang, Weihan, Chen, Jidong, Gu, Xiaotao, Dong, Yuxiao, Ding, Ming, Tang, Jie

Recent advancements in text-to-image generative systems have been largely driven by diffusion models. However, single-stage text-to-image diffusion models still face challenges, in terms of computational efficiency and the refinement of image details

Externí odkaz: http://arxiv.org/abs/2403.05121

Zobrazit plný text záznamu

Report

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Autor: Liu, Xiao, Lei, Xuanyu, Wang, Shengyuan, Huang, Yue, Feng, Zhuoer, Wen, Bosi, Cheng, Jiale, Ke, Pei, Xu, Yifan, Tam, Weng Lam, Zhang, Xiaohan, Sun, Lichao, Gu, Xiaotao, Wang, Hongning, Zhang, Jing, Huang, Minlie, Dong, Yuxiao, Tang, Jie

Alignment has become a critical step for instruction-tuned Large Language Models (LLMs) to become helpful assistants. However, the effective evaluation of alignment for emerging Chinese LLMs is still largely unexplored. To fill in this gap, we introd

Externí odkaz: http://arxiv.org/abs/2311.18743

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání