Search results - "Borenstein, Nadav"

Report

Can Transformers Learn $n$-gram Language Models?

Authors: Svete, Anej, Borenstein, Nadav, Zhou, Mike, Augenstein, Isabelle, Cotterell, Ryan

Much theoretical work has described the ability of transformers to represent formal languages. However, linking theoretical results to empirical performance is not straightforward due to the complex interplay between the architecture, the learning al

External link: http://arxiv.org/abs/2410.03001

Report

Revealing Fine-Grained Values and Opinions in Large Language Models

Authors: Wright, Dustin, Arora, Arnav, Borenstein, Nadav, Yadav, Srishti, Belongie, Serge, Augenstein, Isabelle

Uncovering latent values and opinions in large language models (LLMs) can help identify biases and mitigate potential harm. Recently, this has been approached by presenting LLMs with survey questions and quantifying their stances towards morally and

External link: http://arxiv.org/abs/2406.19238

Report

What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages

Authors: Borenstein, Nadav, Svete, Anej, Chan, Robin, Valvoda, Josef, Nowak, Franz, Augenstein, Isabelle, Chodroff, Eleanor, Cotterell, Ryan

What can large language models learn? By definition, language models (LM) are distributions over strings. Therefore, an intuitive way of addressing the above question is to formalize it as a matter of learnability of classes of distributions over str

External link: http://arxiv.org/abs/2406.04289

Report

Investigating Human Values in Online Communities

Authors: Borenstein, Nadav, Arora, Arnav, Kaffee, Lucie-Aimée, Augenstein, Isabelle

Human values play a vital role as an analytical tool in social sciences, enabling the study of diverse dimensions within society as a whole and among individual communities. This paper addresses the limitations of traditional survey-based studies of

External link: http://arxiv.org/abs/2402.14177

Report

Imitation of Life: A Search Engine for Biologically Inspired Design

Authors: Emuna, Hen, Borenstein, Nadav, Qian, Xin, Kang, Hyeonsu, Chan, Joel, Kittur, Aniket, Shahaf, Dafna

Biologically Inspired Design (BID), or Biomimicry, is a problem-solving methodology that applies analogies from nature to solve engineering challenges. For example, Speedo engineers designed swimsuits based on shark skin. Finding relevant biological

External link: http://arxiv.org/abs/2312.12681

Report

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

Authors: Wang, Yuxia, Reddy, Revanth Gangi, Mujahid, Zain Muhammad, Arora, Arnav, Rubashevskii, Aleksandr, Geng, Jiahui, Afzal, Osama Mohammed, Pan, Liangming, Borenstein, Nadav, Pillai, Aditya, Augenstein, Isabelle, Gurevych, Iryna, Nakov, Preslav

The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. In this work, we present a holistic end-to-end solution for annotating the factuality

External link: http://arxiv.org/abs/2311.09000

Report

PHD: Pixel-Based Language Modeling of Historical Documents

Authors: Borenstein, Nadav, Rust, Phillip, Elliott, Desmond, Augenstein, Isabelle

The digitisation of historical documents has provided historians with unprecedented research opportunities. Yet, the conventional approach to analysing historical documents involves converting them from images to text using OCR, a process that overlo

External link: http://arxiv.org/abs/2310.18343

Report

Measuring Intersectional Biases in Historical Documents

Authors: Borenstein, Nadav, Stańczak, Karolina, Rolskov, Thea, Perez, Natália da Silva, Käfer, Natacha Klein, Augenstein, Isabelle

Data-driven analyses of biases in historical texts can help illuminate the origin and development of biases prevailing in modern society. However, digitised historical documents pose a challenge for NLP practitioners as these corpora suffer from erro

External link: http://arxiv.org/abs/2305.12376

Report

Multilingual Event Extraction from Historical Newspaper Adverts

Authors: Borenstein, Nadav, Perez, Natalia da Silva, Augenstein, Isabelle

NLP methods can aid historians in analyzing textual materials in greater volumes than manually feasible. Developing such methods poses substantial challenges though. First, acquiring large, annotated historical datasets is difficult, as only domain e

External link: http://arxiv.org/abs/2305.10928

Report

Temporally stable video segmentation without video annotations

Authors: Azulay, Aharon, Halperin, Tavi, Vantzos, Orestis, Borenstein, Nadav, Bibi, Ofir

Published in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3449-3458. 2022

Temporally consistent dense video annotations are scarce and hard to collect. In contrast, image segmentation datasets (and pre-trained models) are ubiquitous, and easier to label for any novel task. In this paper, we introduce a method to adapt stil

External link: http://arxiv.org/abs/2110.08893
