Search results

Report

A Framework for Evaluating LLMs Under Task Indeterminacy

Authors: Guerdan, Luke; Wallach, Hanna; Barocas, Solon; Chouldechova, Alexandra

Large language model (LLM) evaluations often assume there is a single correct response -- a gold label -- for each item in the evaluation corpus. However, some tasks can be ambiguous -- i.e., they provide insufficient information to identify a unique

External link: http://arxiv.org/abs/2411.13760

Report

Dimensions of Generative AI Evaluation Design

Authors: Dow, P. Alex; Vaughan, Jennifer Wortman; Barocas, Solon; Atalla, Chad; Chouldechova, Alexandra; Wallach, Hanna

There are few principles or guidelines to ensure evaluations of generative AI (GenAI) models and systems are effective. To help address this gap, we propose a set of general dimensions that capture critical choices involved in GenAI evaluation design

External link: http://arxiv.org/abs/2411.12709

Report

Evaluating Generative AI Systems is a Social Science Measurement Challenge

Across academia, industry, and government, there is an increasing awareness that the measurement tasks involved in evaluating generative AI (GenAI) systems are especially difficult. We argue that these measurement tasks are highly reminiscent of meas

External link: http://arxiv.org/abs/2411.10939

Report

Hilbert and Fréchet bundle versions of the Harish-Chandra and Whittaker Plancherel Theorems

Author: Wallach, Nolan R.

This paper, in particular, gives a complete proof of the direct integral version of the Whittaker Plancherel Theorem. The main emphasis is on certain Hilbert and Fréchet vector bundles over a space that has a submersion onto the tempered dual. This

External link: http://arxiv.org/abs/2410.23226

Report

A Practical Multilevel Governance Framework for Autonomous and Intelligent Systems

Authors: Pöhler, Lukas D.; Diepold, Klaus; Wallach, Wendell

Autonomous and intelligent systems (AIS) facilitate a wide range of beneficial applications across a variety of different domains. However, technical characteristics such as unpredictability and lack of transparency, as well as potential unintended c

External link: http://arxiv.org/abs/2404.13719

Academic article

Academic Freedom & the Politics of the University

Author: Scott, Joan Wallach

Published in: Daedalus, 2024 Jul 01. 153(3), 149-165.

External link: https://www.jstor.org/stable/48784947

Report

A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Authors: Magooda, Ahmed; Helyar, Alec; Jackson, Kyle; Sullivan, David; Atalla, Chad; Sheng, Emily; Vann, Dan; Edgar, Richard; Palangi, Hamid; Lutz, Roman; Kong, Hongliang; Yun, Vincent; Kamal, Eslam; Zarfati, Federico; Wallach, Hanna; Bird, Sarah; Chen, Mei

We present a framework for the automated measurement of responsible AI (RAI) metrics for large language models (LLMs) and associated products and services. Our framework for automatically measuring harms from LLMs builds on existing technical and soc

External link: http://arxiv.org/abs/2310.17750

Report

'One-Size-Fits-All'? Examining Expectations around What Constitute 'Fair' or 'Good' NLG System Behaviors

Authors: Lucy, Li; Blodgett, Su Lin; Shokouhi, Milad; Wallach, Hanna; Olteanu, Alexandra

Fairness-related assumptions about what constitute appropriate NLG system behaviors range from invariance, where systems are expected to behave identically for social groups, to adaptation, where behaviors should instead vary across them. To illumina

External link: http://arxiv.org/abs/2310.15398

Report

Designing for Passengers' Information Needs on Fellow Travelers: A Comparison of Day and Night Rides in Shared Automated Vehicles

Authors: Flohr, Lukas A.; Schuß, Martina; Wallach, Dieter P.; Krüger, Antonio; Riener, Andreas

Shared automated mobility-on-demand promises efficient, sustainable, and flexible transportation. Nevertheless, security concerns, resilience, and their mutual influence - especially at night - will likely be the most critical barriers to public adop

External link: http://arxiv.org/abs/2308.02616

Report

Chiral molecule candidates for trapped ion spectroscopy by ab initio calculations: from state preparation to parity violation

Authors: Landau, Arie; Eduardus; Behar, Doron; Wallach, Eliana Ruth; Pašteka, Lukáš F.; Faraji, Shirin; Borschevsky, Anastasia; Shagam, Yuval

Published in: J. Chem. Phys. 159, 114307 (2023)

Parity non-conservation (PNC) due to the weak interaction is predicted to give rise to enantiomer dependent vibrational constants in chiral molecules, but the phenomenon has so far eluded experimental observation. The enhanced sensitivity of molecule

External link: http://arxiv.org/abs/2306.09788
