Search results

Report

VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Authors: Li, Lei; Wei, Yuancheng; Xie, Zhihui; Yang, Xuqing; Song, Yifan; Wang, Peiyi; An, Chenxin; Liu, Tianyu; Li, Sujian; Lin, Bill Yuchen; Kong, Lingpeng; Liu, Qi

Vision-language generative reward models (VL-GenRMs) play a crucial role in aligning and evaluating multimodal AI systems, yet their own evaluation remains under-explored. Current assessment methods primarily rely on AI-annotated preference labels fr

External link: http://arxiv.org/abs/2411.17451

Report

AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories

Authors: Song, Yifan; Xiong, Weimin; Zhao, Xiutian; Zhu, Dawei; Wu, Wenhao; Wang, Ke; Li, Cheng; Peng, Wei; Li, Sujian

Fine-tuning on agent-environment interaction trajectory data holds significant promise for surfacing generalized agent capabilities in open-source large language models (LLMs). In this work, we introduce AgentBank, by far the largest trajectory tunin

External link: http://arxiv.org/abs/2410.07706

Report

Shapley Value-based Contrastive Alignment for Multimodal Information Extraction

Authors: Luo, Wen; Xia, Yu; Tianshu, Shen; Li, Sujian

The rise of social media and the exponential growth of multimodal communication necessitates advanced techniques for Multimodal Information Extraction (MIE). However, existing methodologies primarily rely on direct Image-Text interactions, a paradigm

External link: http://arxiv.org/abs/2407.17854

Report

When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments

Authors: Zhang, Chong; Liu, Xinyi; Zhang, Zhongmou; Jin, Mingyu; Li, Lingyao; Wang, Zhenting; Hua, Wenyue; Shu, Dong; Zhu, Suiyuan; Jin, Xiaobo; Li, Sujian; Du, Mengnan; Zhang, Yongfeng

Can AI Agents simulate real-world trading environments to investigate the impact of external factors on stock trading activities (e.g., macroeconomics, policy changes, company fundamentals, and global events)? These factors, which frequently influenc

External link: http://arxiv.org/abs/2407.18957

Report

The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism

Authors: Song, Yifan; Wang, Guoyin; Li, Sujian; Lin, Bill Yuchen

Current evaluations of large language models (LLMs) often overlook non-determinism, typically focusing on a single output per example. This limits our understanding of LLM performance variability in real-world applications. Our study addresses this i

External link: http://arxiv.org/abs/2407.10457

Report

EERPD: Leveraging Emotion and Emotion Regulation for Improving Personality Detection

Authors: Li, Zheng; Zhu, Dawei; Ma, Qilong; Xiong, Weimin; Li, Sujian

Personality is a fundamental construct in psychology, reflecting an individual's behavior, thinking, and emotional patterns. Previous research has made some progress in personality detection, primarily by utilizing the whole text to predict person

External link: http://arxiv.org/abs/2406.16079

Report

Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement

Authors: Xiong, Weimin; Song, Yifan; Zhao, Xiutian; Wu, Wenhao; Wang, Xun; Wang, Ke; Li, Cheng; Peng, Wei; Li, Sujian

Large language model agents have exhibited exceptional performance across a range of complex interactive tasks. Recent approaches have utilized tuning with expert trajectories to enhance agent performance, yet they primarily concentrate on outcome re

External link: http://arxiv.org/abs/2406.11176

Report

Long Context Alignment with Short Instructions and Synthesized Positions

Authors: Wu, Wenhao; Wang, Yizhong; Fu, Yao; Yue, Xiang; Zhu, Dawei; Li, Sujian

Effectively handling instructions with extremely long context remains a challenge for Large Language Models (LLMs), typically necessitating high-quality long data and substantial computational resources. This paper introduces Step-Skipping Alignment

External link: http://arxiv.org/abs/2405.03939

Report

LongEmbed: Extending Embedding Models for Long Context Retrieval

Authors: Zhu, Dawei; Wang, Liang; Yang, Nan; Song, Yifan; Wu, Wenhao; Wei, Furu; Li, Sujian

Embedding models play a pivotal role in modern NLP applications such as IR and RAG. While the context limit of LLMs has been pushed beyond 1 million tokens, embedding models are still confined to a narrow context window not exceeding 8k tokens, refrain

External link: http://arxiv.org/abs/2404.12096

Report

CoUDA: Coherence Evaluation via Unified Data Augmentation

Authors: Zhu, Dawei; Wu, Wenhao; Song, Yifan; Zhu, Fangwei; Cao, Ziqiang; Li, Sujian

Coherence evaluation aims to assess the organization and structure of a discourse, which remains challenging even in the era of large language models. Due to the scarcity of annotated data, data augmentation is commonly used for training coherence ev

External link: http://arxiv.org/abs/2404.00681
