Search results - "Chern, Steffi"

Report

BeHonest: Benchmarking Honesty in Large Language Models

Authors: Chern, Steffi, Hu, Zhulin, Yang, Yuqing, Chern, Ethan, Guo, Yuan, Jin, Jiahe, Wang, Binjie, Liu, Pengfei

Previous works on Large Language Models (LLMs) have mainly focused on evaluating their helpfulness or harmlessness. However, honesty, another crucial alignment criterion, has received relatively less attention. Dishonest behaviors in LLMs, such as sp

External link: http://arxiv.org/abs/2406.13261

Report

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and s

External link: http://arxiv.org/abs/2406.12753

Report

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate

Authors: Chern, Steffi, Chern, Ethan, Neubig, Graham, Liu, Pengfei

Despite the utility of Large Language Models (LLMs) across a wide range of tasks and scenarios, developing a method for reliably evaluating LLMs across varied contexts continues to be challenging. Modern evaluation approaches often use LLMs to assess

External link: http://arxiv.org/abs/2401.16788

Report

Combating Adversarial Attacks with Multi-Agent Debate

Authors: Chern, Steffi, Fan, Zhen, Liu, Andy

While state-of-the-art language models have achieved impressive results, they remain susceptible to inference-time adversarial attacks, such as adversarial prompts generated by red teams (arXiv:2209.07858). One approach proposed to improve the general

External link: http://arxiv.org/abs/2401.05998

Report

Align on the Fly: Adapting Chatbot Behavior to Established Norms

Authors: Xu, Chunpu, Chern, Steffi, Chern, Ethan, Zhang, Ge, Wang, Zekun, Liu, Ruibo, Li, Jing, Fu, Jie, Liu, Pengfei

In this paper, we aim to align large language models with the ever-changing, complex, and diverse human values (e.g., social norms) across time and locations. This presents a challenge to existing alignment techniques, such as supervised fine-tuning,

External link: http://arxiv.org/abs/2312.15907

Report

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

Authors: Chern, I-Chun, Chern, Steffi, Chen, Shiqi, Yuan, Weizhe, Feng, Kehua, Zhou, Chunting, He, Junxian, Neubig, Graham, Liu, Pengfei

The emergence of generative pre-trained models has facilitated the synthesis of high-quality text, but it has also posed challenges in identifying factual errors in the generated text. In particular: (1) A wider range of tasks now face an increasing

External link: http://arxiv.org/abs/2307.13528

Academic article

Automated Analysis of Fluency Behaviors in Aphasia.

Authors: Fromm, Davida¹ fromm@andrew.cmu.edu, Chern, Steffi², Geng, Zihan², Kim, Mason², Greenhouse, Joel², MacWhinney, Brian¹

Published in: Journal of Speech, Language & Hearing Research. Jul 2024, Vol. 67, p2333-2342. 10p.
