Showing 1 - 10 of 58 for search: '"Yu, Jifan"'
Author:
Zhang, Zheyuan, Zhang-Li, Daniel, Yu, Jifan, Gong, Linlu, Zhou, Jinchang, Liu, Zhiyuan, Hou, Lei, Li, Juanzi
Large language models (LLMs) have been employed in various intelligent educational tasks to assist teaching. While preliminary explorations have focused on independent LLM-empowered agents for specific educational tasks, the potential for LLMs within…
External link:
http://arxiv.org/abs/2406.19226
Author:
Ma, Zeyao, Zhang, Bohan, Zhang, Jing, Yu, Jifan, Zhang, Xiaokang, Zhang, Xiaohan, Luo, Sijia, Wang, Xi, Tang, Jie
We introduce SpreadsheetBench, a challenging spreadsheet manipulation benchmark exclusively derived from real-world scenarios, designed to immerse current large language models (LLMs) in the actual workflow of spreadsheet users. Unlike existing benchmarks…
External link:
http://arxiv.org/abs/2406.14991
Author:
Tu, Shangqing, Pan, Zhuoran, Wang, Wenxuan, Zhang, Zhexin, Sun, Yuliang, Yu, Jifan, Wang, Hongning, Hou, Lei, Li, Juanzi
Large language models (LLMs) have been increasingly applied to various domains, which triggers increasing concerns about LLMs' safety in specialized domains, e.g., medicine. However, testing the domain-specific safety of LLMs is challenging due to the…
External link:
http://arxiv.org/abs/2406.11682
Author:
Tu, Shangqing, Wang, Yuanchun, Yu, Jifan, Xie, Yuyang, Shi, Yaran, Wang, Xiaozhi, Zhang, Jing, Hou, Lei, Li, Juanzi
Large language models have achieved remarkable success on general NLP tasks, but they may fall short for domain-specific problems. Recently, various Retrieval-Augmented Large Language Models (RALLMs) have been proposed to address this shortcoming. However,…
External link:
http://arxiv.org/abs/2406.11681
Author:
Wang, Yuanchun, Yu, Jifan, Yao, Zijun, Zhang, Jing, Xie, Yuyang, Tu, Shangqing, Fu, Yiyang, Feng, Youhe, Zhang, Jinkai, Zhang, Jingyao, Huang, Bowen, Li, Yuanyao, Yuan, Huihui, Hou, Lei, Li, Juanzi, Tang, Jie
Applying large language models (LLMs) for academic API usage shows promise in reducing researchers' academic information-seeking efforts. However, current LLM API-using methods struggle with the complex API coupling commonly encountered in academic queries…
External link:
http://arxiv.org/abs/2405.15165
Knowledge tracing (KT), which aims to mine students' mastery of knowledge from their exercise records and predict their performance on future test questions, is a critical task in educational assessment. While researchers have achieved tremendous success with t…
External link:
http://arxiv.org/abs/2405.14391
Detecting non-factual content is a longstanding goal of increasing the trustworthiness of large language model (LLM) generations. Current factuality probes, trained using human-annotated labels, exhibit limited transferability to out-of-distribution c…
External link:
http://arxiv.org/abs/2404.06742
Providing knowledge documents for large language models (LLMs) has emerged as a promising solution to update the static knowledge inherent in their parameters. However, knowledge in the document may conflict with the memory of LLMs due to outdated or…
External link:
http://arxiv.org/abs/2404.03577
Modern Large Language Models (LLMs) have showcased remarkable prowess in various tasks necessitating sophisticated cognitive behaviors. Nevertheless, a paradoxical performance discrepancy is observed, where these models underperform in seemingly elementary…
External link:
http://arxiv.org/abs/2404.03532
Author:
Yu, Jifan, Zhang, Xiaohan, Xu, Yifan, Lei, Xuanyu, Yao, Zijun, Zhang, Jing, Hou, Lei, Li, Juanzi
Empowered by large-scale pretrained language models, existing dialogue systems have demonstrated impressive performance in conducting fluent and natural-sounding conversations. However, they are still plagued by the hallucination problem, causing un…
External link:
http://arxiv.org/abs/2404.03491