Showing 1 - 10 of 33 for search: '"Nan, Linyong"'
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents
Author:
Zhao, Yilun, Long, Yitao, Liu, Hongjun, Kamoi, Ryo, Nan, Linyong, Chen, Lyuhao, Liu, Yixin, Tang, Xiangru, Zhang, Rui, Cohan, Arman
Recent LLMs have demonstrated remarkable performance in solving exam-like math word problems. However, the degree to which these numerical reasoning skills are effective in real-world scenarios, particularly in expert domains, is still largely unexplored. …
External link:
http://arxiv.org/abs/2311.09805
On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering
This study introduces a new long-form database question answering dataset designed to evaluate how Large Language Models (LLMs) interact with a SQL interpreter. The task necessitates LLMs to strategically generate multiple SQL queries to retrieve sufficient …
External link:
http://arxiv.org/abs/2311.09721
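The entry above describes an agent setting in which an LLM issues a sequence of SQL queries against an interpreter before composing its answer. Below is a minimal sketch of such a loop; `call_llm` is a hypothetical stand-in for any chat-completion API, and the stopping convention (the model emitting `ANSWER:`) is an assumption for illustration, not the paper's protocol.

```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API call."""
    raise NotImplementedError

def answer_with_sql(question: str, db_path: str, max_turns: int = 5) -> str:
    """Let the model alternate between issuing SQL and inspecting results
    until it commits to a final answer."""
    conn = sqlite3.connect(db_path)
    transcript = f"Question: {question}\n"
    try:
        for _ in range(max_turns):
            reply = call_llm(
                transcript
                + "Reply 'SQL: <query>' to query the database, or "
                + "'ANSWER: <final answer>' once you have enough data.\n"
            )
            if reply.startswith("ANSWER:"):
                return reply.removeprefix("ANSWER:").strip()
            query = reply.removeprefix("SQL:").strip()
            try:
                rows = conn.execute(query).fetchall()
                transcript += f"SQL: {query}\nResult: {rows[:20]}\n"
            except sqlite3.Error as err:
                transcript += f"SQL: {query}\nError: {err}\n"  # model may retry
        return "no answer within the turn budget"
    finally:
        conn.close()
```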
RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations
Author:
Zhao, Yilun, Zhao, Chen, Nan, Linyong, Qi, Zhenting, Zhang, Wenlin, Tang, Xiangru, Mi, Boyu, Radev, Dragomir
Despite significant progress having been made in question answering on tabular data (Table QA), it's unclear whether, and to what extent, existing Table QA models are robust to task-specific perturbations, e.g., replacing key question entities or shuffling …
External link:
http://arxiv.org/abs/2306.14321
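The entry above asks whether Table QA models survive task-specific perturbations. Below is a minimal sketch of one such perturbation, shuffling table columns while leaving cell contents intact (assuming the table is held as a header list plus row lists); a robust model's answer should be invariant to it.

```python
import random

def shuffle_columns(header: list[str], rows: list[list[str]], seed: int = 0):
    """Permute columns jointly in header and rows. The content is unchanged,
    so a robust Table QA model should return the same answer."""
    rng = random.Random(seed)
    order = list(range(len(header)))
    rng.shuffle(order)
    new_header = [header[i] for i in order]
    new_rows = [[row[i] for i in order] for row in rows]
    return new_header, new_rows

header = ["Player", "Team", "Goals"]
rows = [["Messi", "Inter Miami", "11"], ["Kane", "Bayern", "36"]]
print(shuffle_columns(header, rows))
```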
Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios
Tabular data is prevalent across various industries, necessitating significant time and effort for users to understand and manipulate for their information-seeking purposes. The advancements in large language models (LLMs) have shown enormous potential …
External link:
http://arxiv.org/abs/2305.14987
QTSumm: Query-Focused Summarization over Tabular Data
Author:
Zhao, Yilun, Qi, Zhenting, Nan, Linyong, Mi, Boyu, Liu, Yixin, Zou, Weijin, Han, Simeng, Chen, Ruizhe, Tang, Xiangru, Xu, Yumo, Radev, Dragomir, Cohan, Arman
People primarily consult tables to conduct data analysis or answer specific questions. Text generation systems that can provide accurate table summaries tailored to users' information needs can facilitate more efficient access to relevant data insights. …
External link:
http://arxiv.org/abs/2305.14303
Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies
Author:
Nan, Linyong, Zhao, Yilun, Zou, Weijin, Ri, Narutatsu, Tae, Jaesung, Zhang, Ellen, Cohan, Arman, Radev, Dragomir
In-context learning (ICL) has emerged as a new approach to various natural language processing tasks, utilizing large language models (LLMs) to make predictions based on context that has been supplemented with a few examples or task-specific instructions. …
External link:
http://arxiv.org/abs/2305.12586
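The entry above concerns in-context learning, where the prompt is assembled from a task instruction plus a few demonstrations before the test input. Below is a generic sketch of that assembly; the Text-to-SQL framing and the `Input:`/`Output:` demonstration format are illustrative assumptions, not the paper's exact prompt design.

```python
def build_icl_prompt(instruction: str, demos: list[tuple[str, str]],
                     test_input: str) -> str:
    """Assemble a few-shot prompt: instruction, then (input, output) demos,
    then the test input with its output left for the model to complete."""
    parts = [instruction.strip(), ""]
    for demo_in, demo_out in demos:
        parts += [f"Input: {demo_in}", f"Output: {demo_out}", ""]
    parts += [f"Input: {test_input}", "Output:"]
    return "\n".join(parts)

prompt = build_icl_prompt(
    "Translate the question into a SQL query over the given schema.",
    [("How many players scored? | players(name, goals)",
      "SELECT COUNT(*) FROM players WHERE goals > 0;")],
    "Who scored the most goals? | players(name, goals)",
)
print(prompt)
```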
LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control
Logical Table-to-Text (LT2T) generation is tasked with generating logically faithful sentences from tables. There currently exist two challenges in the field: 1) Faithfulness: how to generate sentences that are factually correct given the table content …
External link:
http://arxiv.org/abs/2302.02962
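The entry above frames faithfulness as generating sentences that are factually correct given the table content. One way to make that checkable, in the spirit of logic-form control, is to pair each sentence with a small executable logic form and evaluate it against the table. The sketch below does exactly that; the two-operator grammar is an illustrative assumption, not the paper's full formalism.

```python
def execute(form: dict, table: list[dict]):
    """Evaluate a tiny logic form against a table of rows (dicts).
    The two operators here are illustrative, not the paper's grammar."""
    if form["op"] == "argmax":      # label of the row maximizing a column
        best = max(table, key=lambda r: r[form["column"]])
        return best[form["label"]]
    if form["op"] == "count":       # number of rows with column > threshold
        return sum(1 for r in table if r[form["column"]] > form["gt"])
    raise ValueError(f"unknown op: {form['op']}")

table = [{"team": "Bayern", "goals": 94}, {"team": "Dortmund", "goals": 83}]
# Claim: "Bayern scored the most goals." Faithful iff the logic form agrees.
assert execute({"op": "argmax", "column": "goals", "label": "team"}, table) == "Bayern"
assert execute({"op": "count", "column": "goals", "gt": 90}, table) == 1
```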
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Author:
Liu, Yixin, Fabbri, Alexander R., Liu, Pengfei, Zhao, Yilun, Nan, Linyong, Han, Ruilin, Han, Simeng, Joty, Shafiq, Wu, Chien-Sheng, Xiong, Caiming, Radev, Dragomir
Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests. However, existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have insufficient scale …
External link:
http://arxiv.org/abs/2212.07981
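The entry above hinges on inter-annotator agreement. Below is a self-contained sketch of one standard agreement statistic, Cohen's kappa for two annotators, which corrects raw agreement p_o for chance agreement p_e via kappa = (p_o - p_e) / (1 - p_e); this is a textbook measure, not necessarily the one the paper uses.

```python
from collections import Counter

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's kappa for two annotators: observed agreement corrected
    for the agreement expected by chance, (po - pe) / (1 - pe)."""
    assert len(a) == len(b) and a
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n            # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)  # chance
    return (po - pe) / (1 - pe)

ann1 = ["good", "good", "bad", "good", "bad"]
ann2 = ["good", "bad", "bad", "good", "bad"]
print(round(cohens_kappa(ann1, ann2), 3))  # 0.615 here; 1.0 is perfect agreement
```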
ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples
Reasoning over tabular data requires both table structure understanding and a broad set of table reasoning skills. Current models with table-specific architectures and pre-training methods perform well on understanding table structures, but they still …
External link:
http://arxiv.org/abs/2210.12374
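The entry above is about equipping models with table reasoning skills, and per the paper's title one route is pre-training on synthetically generated reasoning examples. Below is a minimal sketch of generating one such example, a max-aggregation question whose gold answer is derived programmatically from the table; the question template is an illustrative assumption.

```python
def synth_max_example(header: list[str], rows: list[list],
                      key_col: int, num_col: int):
    """One synthetic (question, answer) pair whose gold answer is computed
    directly from the table, so supervision is exact and free."""
    question = (f"Which {header[key_col].lower()} has the highest "
                f"{header[num_col].lower()}?")
    answer = max(rows, key=lambda r: r[num_col])[key_col]
    return question, answer

header = ["City", "Population"]
rows = [["Oslo", 709_000], ["Bergen", 291_000], ["Trondheim", 212_000]]
print(synth_max_example(header, rows, key_col=0, num_col=1))
# ('Which city has the highest population?', 'Oslo')
```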
FOLIO: Natural Language Reasoning with First-Order Logic
Author:
Han, Simeng, Schoelkopf, Hailey, Zhao, Yilun, Qi, Zhenting, Riddell, Martin, Zhou, Wenfei, Coady, James, Peng, David, Qiao, Yujie, Benson, Luke, Sun, Lucy, Wardle-Solano, Alex, Szabo, Hannah, Zubova, Ekaterina, Burtell, Matthew, Fan, Jonathan, Liu, Yixin, Wong, Brian, Sailor, Malcolm, Ni, Ansong, Nan, Linyong, Kasai, Jungo, Yu, Tao, Zhang, Rui, Fabbri, Alexander R., Kryscinski, Wojciech, Yavuz, Semih, Liu, Ye, Lin, Xi Victoria, Joty, Shafiq, Zhou, Yingbo, Xiong, Caiming, Ying, Rex, Cohan, Arman, Radev, Dragomir
Large language models (LLMs) have achieved remarkable performance on a variety of natural language understanding tasks. However, existing benchmarks are inadequate in measuring the complex logical reasoning capabilities of a model. We present FOLIO, a human-annotated, logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations. …
External link:
http://arxiv.org/abs/2209.00840
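FOLIO pairs natural-language premises and conclusions with first-order logic annotations and labels each conclusion True, False, or Unknown. Below is a brute-force sketch of computing that label for one tiny annotated instance by enumerating all predicate assignments over a finite domain; the example instance is invented for illustration, not taken from the dataset.

```python
from itertools import product

DOMAIN = ("a", "b")  # tiny finite domain

def all_models():
    """Each unary predicate = a subset of DOMAIN; enumerate all combinations."""
    subsets = list(product([False, True], repeat=len(DOMAIN)))
    for student_bits, smart_bits in product(subsets, subsets):
        yield dict(zip(DOMAIN, student_bits)), dict(zip(DOMAIN, smart_bits))

def label(premises, conclusion):
    """FOLIO-style label (True / False / Unknown) by model enumeration."""
    holds, fails = False, False
    for student, smart in all_models():
        if all(p(student, smart) for p in premises):
            if conclusion(student, smart):
                holds = True
            else:
                fails = True
    if holds and not fails:
        return "True"       # conclusion entailed by the premises
    if fails and not holds:
        return "False"      # conclusion contradicted by the premises
    return "Unknown"

# Premises: forall x. student(x) -> smart(x); student(a). Conclusion: smart(a).
premises = [
    lambda st, sm: all((not st[x]) or sm[x] for x in DOMAIN),
    lambda st, sm: st["a"],
]
print(label(premises, lambda st, sm: sm["a"]))  # True
```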