Search results - "Jacovi, Alon"

Report

CoverBench: A Challenging Benchmark for Complex Claim Verification

Authors: Jacovi, Alon; Ambar, Moran; Ben-David, Eyal; Shaham, Uri; Feder, Amir; Geva, Mor; Marcus, Dror; Caciularu, Avi

There is a growing line of research on verifying the correctness of language models' outputs. At the same time, LMs are being used to tackle complex queries that require reasoning. We introduce CoverBench, a challenging benchmark focused on verifying

External link: http://arxiv.org/abs/2408.03325

Report

Data Contamination Report from the 2024 CONDA Shared Task

The 1st Workshop on Data Contamination (CONDA 2024) focuses on all relevant aspects of data contamination in natural language processing, where data contamination is understood as situations where evaluation data is included in pre-training corpora u

External link: http://arxiv.org/abs/2407.21530

Report

Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP

Authors: Goldman, Omer; Jacovi, Alon; Slobodkin, Aviv; Maimon, Aviya; Dagan, Ido; Tsarfaty, Reut

Improvements in language models' capabilities have pushed their applications towards longer contexts, making long-context evaluation and development an active research area. However, many disparate use-cases are grouped together under the umbrella te

External link: http://arxiv.org/abs/2407.00402

Report

Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

Authors: Cattan, Arie; Jacovi, Alon; Fabrikant, Alex; Herzig, Jonathan; Aharoni, Roee; Rashkin, Hannah; Marcus, Dror; Hassidim, Avinatan; Matias, Yossi; Szpektor, Idan; Caciularu, Avi

Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scen

External link: http://arxiv.org/abs/2406.13632

Report

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

Authors: Caciularu, Avi; Jacovi, Alon; Ben-David, Eyal; Goldshtein, Sasha; Schuster, Tal; Herzig, Jonathan; Elidan, Gal; Globerson, Amir

Large Language Models (LLMs) often do not perform well on queries that require the aggregation of information across texts. To better evaluate this setting and facilitate modeling efforts, we introduce TACT - Text And Calculations through Tables, a d

External link: http://arxiv.org/abs/2406.03618

Report

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Authors: Jacovi, Alon; Bitton, Yonatan; Bohnet, Bernd; Herzig, Jonathan; Honovich, Or; Tseng, Michael; Collins, Michael; Aharoni, Roee; Geva, Mor

Prompting language models to provide step-by-step answers (e.g., "Chain-of-Thought") is the prominent approach for complex reasoning tasks, where more accurate reasoning chains typically improve downstream task performance. Recent literature discusse

External link: http://arxiv.org/abs/2402.00559

Report

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Authors: Jacovi, Alon; Caciularu, Avi; Herzig, Jonathan; Aharoni, Roee; Bohnet, Bernd; Geva, Mor

A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences). Various few-shot tool-usage strateg

External link: http://arxiv.org/abs/2310.10062

Report

Unpacking Human-AI Interaction in Safety-Critical Industries: A Systematic Literature Review

Authors: Bach, Tita A.; Kristiansen, Jenny K.; Babic, Aleksandar; Jacovi, Alon

Published in: Access-2024-19782

Ensuring quality human-AI interaction (HAII) in safety-critical industries is essential. Failure to do so can lead to catastrophic and deadly consequences. Despite this urgency, existing research on HAII is limited, fragmented, and inconsistent. We p

External link: http://arxiv.org/abs/2310.03392

Report

Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks

Authors: Jacovi, Alon; Caciularu, Avi; Goldman, Omer; Goldberg, Yoav

Data contamination has become prevalent and challenging with the rise of models pretrained on large automatically-crawled corpora. For closed models, the training data becomes a trade secret, and even for open models, it is not trivial to detect cont

External link: http://arxiv.org/abs/2305.10160

Report

Neighboring Words Affect Human Interpretation of Saliency Explanations

Authors: Jacovi, Alon; Schuff, Hendrik; Adel, Heike; Vu, Ngoc Thang; Goldberg, Yoav

Word-level saliency explanations ("heat maps over words") are often used to communicate feature-attribution in text-based models. Recent studies found that superficial factors such as word length can distort human interpretation of the communicated s

External link: http://arxiv.org/abs/2305.02679
