Výsledky vyhledávání - "Caciularu, Avi"

Report

CoverBench: A Challenging Benchmark for Complex Claim Verification

Autor: Jacovi, Alon, Ambar, Moran, Ben-David, Eyal, Shaham, Uri, Feder, Amir, Geva, Mor, Marcus, Dror, Caciularu, Avi

There is a growing line of research on verifying the correctness of language models' outputs. At the same time, LMs are being used to tackle complex queries that require reasoning. We introduce CoverBench, a challenging benchmark focused on verifying

Externí odkaz: http://arxiv.org/abs/2408.03325

Zobrazit plný text záznamu

Report

SEAM: A Stochastic Benchmark for Multi-Document Tasks

Autor: Lior, Gili, Caciularu, Avi, Cattan, Arie, Levy, Shahar, Shapira, Ori, Stanovsky, Gabriel

Various tasks, such as summarization, multi-hop question answering, or coreference resolution, are naturally phrased over collections of real-world documents. Such tasks present a unique set of challenges, revolving around the lack of coherent narrat

Externí odkaz: http://arxiv.org/abs/2406.16086

Zobrazit plný text záznamu

Report

Identifying User Goals from UI Trajectories

Autor: Berkovitch, Omri, Caduri, Sapir, Kahlon, Noam, Efros, Anatoly, Caciularu, Avi, Dagan, Ido

Autonomous agents that interact with graphical user interfaces (GUIs) hold significant potential for enhancing user experiences. To further improve these experiences, agents need to be personalized and proactive. By effectively comprehending user int

Externí odkaz: http://arxiv.org/abs/2406.14314

Zobrazit plný text záznamu

Report

Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

Autor: Cattan, Arie, Jacovi, Alon, Fabrikant, Alex, Herzig, Jonathan, Aharoni, Roee, Rashkin, Hannah, Marcus, Dror, Hassidim, Avinatan, Matias, Yossi, Szpektor, Idan, Caciularu, Avi

Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scen

Externí odkaz: http://arxiv.org/abs/2406.13632

Zobrazit plný text záznamu

Report

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

Autor: Caciularu, Avi, Jacovi, Alon, Ben-David, Eyal, Goldshtein, Sasha, Schuster, Tal, Herzig, Jonathan, Elidan, Gal, Globerson, Amir

Large Language Models (LLMs) often do not perform well on queries that require the aggregation of information across texts. To better evaluate this setting and facilitate modeling efforts, we introduce TACT - Text And Calculations through Tables, a d

Externí odkaz: http://arxiv.org/abs/2406.03618

Zobrazit plný text záznamu

Report

Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

Autor: Jurenka, Irina, Kunesch, Markus, McKee, Kevin R., Gillick, Daniel, Zhu, Shaojian, Wiltberger, Sara, Phal, Shubham Milind, Hermann, Katherine, Kasenberg, Daniel, Bhoopchand, Avishkar, Anand, Ankit, Pîslar, Miruna, Chan, Stephanie, Wang, Lisa, She, Jennifer, Mahmoudieh, Parsa, Rysbek, Aliya, Ko, Wei-Jen, Huber, Andrea, Wiltshire, Brett, Elidan, Gal, Rabin, Roni, Rubinovitz, Jasmin, Pitaru, Amit, McAllister, Mac, Wilkowski, Julia, Choi, David, Engelberg, Roee, Hackmon, Lidan, Levin, Adva, Griffin, Rachel, Sears, Michael, Bar, Filip, Mesar, Mia, Jabbour, Mana, Chaudhry, Arslan, Cohan, James, Thiagarajan, Sridhar, Levine, Nir, Brown, Ben, Gorur, Dilan, Grant, Svetlana, Hashimshoni, Rachel, Weidinger, Laura, Hu, Jieru, Chen, Dawn, Dolecki, Kuba, Akbulut, Canfer, Bileschi, Maxwell, Culp, Laura, Dong, Wen-Xin, Marchal, Nahema, Van Deman, Kelsie, Misra, Hema Bajaj, Duah, Michael, Ambar, Moran, Caciularu, Avi, Lefdal, Sandra, Summerfield, Chris, An, James, Kamienny, Pierre-Alexandre, Mohdi, Abhinit, Strinopoulous, Theofilos, Hale, Annie, Anderson, Wayne, Cobo, Luis C., Efron, Niv, Ananda, Muktha, Mohamed, Shakir, Heymans, Maureen, Ghahramani, Zoubin, Matias, Yossi, Gomes, Ben, Ibrahim, Lila

A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every

Externí odkaz: http://arxiv.org/abs/2407.12687

Zobrazit plný text záznamu

Report

Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance

Autor: Goldman, Omer, Caciularu, Avi, Eyal, Matan, Cao, Kris, Szpektor, Idan, Tsarfaty, Reut

Despite it being the cornerstone of BPE, the most common tokenization algorithm, the importance of compression in the tokenization process is still unclear. In this paper, we argue for the theoretical importance of compression, that can be viewed as

Externí odkaz: http://arxiv.org/abs/2403.06265

Zobrazit plný text záznamu

Report

Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models

Autor: Ghandeharioun, Asma, Caciularu, Avi, Pearce, Adam, Dixon, Lucas, Geva, Mor

Understanding the internal representations of large language models (LLMs) can help explain models' behavior and verify their alignment with human values. Given the capabilities of LLMs in generating human-understandable text, we propose leveraging t

Externí odkaz: http://arxiv.org/abs/2401.06102

Zobrazit plný text záznamu

Report

Optimizing Retrieval-augmented Reader Models via Token Elimination

Autor: Berchansky, Moshe, Izsak, Peter, Caciularu, Avi, Dagan, Ido, Wasserblat, Moshe

Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across a variety of open-domain tasks, such as question answering, fact checking, etc. In FiD, supporting passages are first retrieved and then processed using a gener

Externí odkaz: http://arxiv.org/abs/2310.13682

Zobrazit plný text záznamu

Report

The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models

Autor: Slobodkin, Aviv, Goldman, Omer, Caciularu, Avi, Dagan, Ido, Ravfogel, Shauli

Large language models (LLMs) have been shown to possess impressive capabilities, while also raising crucial concerns about the faithfulness of their responses. A primary issue arising in this context is the management of (un)answerable queries by LLM

Externí odkaz: http://arxiv.org/abs/2310.11877

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání