Showing 1 - 10 of 41 for the search: '"Arora, Aryaman"'
Author:
Wu, Zhengxuan, Arora, Aryaman, Wang, Zheng, Geiger, Atticus, Jurafsky, Dan, Manning, Christopher D., Potts, Christopher
Parameter-efficient finetuning (PEFT) methods seek to adapt large neural models via updates to a small number of weights. However, much prior interpretability work has shown that representations encode rich semantic information, suggesting that editing …
External link:
http://arxiv.org/abs/2404.03592
Author:
Wu, Zhengxuan, Geiger, Atticus, Arora, Aryaman, Huang, Jing, Wang, Zheng, Goodman, Noah D., Manning, Christopher D., Potts, Christopher
Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce $\textbf{pyvene}$, an open-source Python library that …
External link:
http://arxiv.org/abs/2403.07809
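The interventions on model-internal states that the pyvene abstract describes can be illustrated generically. Below is a minimal numpy sketch of one such operation, an interchange intervention: part of a hidden state computed from a base input is overwritten with the corresponding activations from a source input. This is an illustrative toy only; the model, function names, and dimensions are invented here and do not reflect pyvene's actual API.

```python
import numpy as np

# Toy two-layer network: h = relu(W1 x), y = W2 h.
# All names and shapes are illustrative, not taken from pyvene.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 4))
W2 = rng.normal(size=(2, 4))

def hidden(x):
    # Model-internal state we will intervene on.
    return np.maximum(W1 @ x, 0.0)

def forward_intervened(base_x, source_x, dims):
    # Interchange intervention: run the base input, but splice in the
    # source input's hidden activations at the coordinates in `dims`.
    h = hidden(base_x)
    h[dims] = hidden(source_x)[dims]
    return W2 @ h

base, source = rng.normal(size=4), rng.normal(size=4)
y = forward_intervened(base, source, dims=[0, 1])
print(y.shape)  # (2,)
```

Swapping the hidden state with itself (`source_x = base_x`) leaves the output unchanged, which is a convenient sanity check for intervention code.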
Language models (LMs) have proven to be powerful tools for psycholinguistic research, but most prior work has focused on purely behavioural measures (e.g., surprisal comparisons). At the same time, research in model interpretability has begun to illuminate …
External link:
http://arxiv.org/abs/2402.12560
Author:
San, Nay, Paraskevopoulos, Georgios, Arora, Aryaman, He, Xiluo, Kaur, Prabhjot, Adams, Oliver, Jurafsky, Dan
While massively multilingual speech models like wav2vec 2.0 XLSR-128 can be directly fine-tuned for automatic speech recognition (ASR), downstream performance can still be relatively poor on languages that are under-represented in the pre-training data …
External link:
http://arxiv.org/abs/2402.02302
Author:
Wu, Zhengxuan, Geiger, Atticus, Huang, Jing, Arora, Aryaman, Icard, Thomas, Potts, Christopher, Goodman, Noah D.
We respond to the recent paper by Makelov et al. (2023), which reviews subspace interchange intervention methods like distributed alignment search (DAS; Geiger et al. 2023) and claims that these methods potentially cause "interpretability illusions".
External link:
http://arxiv.org/abs/2401.12631
Author:
Prasanna, Kabilan, Arora, Aryaman
Tamil, a Dravidian language of South Asia, is a highly diglossic language with two very different registers in everyday use: Literary Tamil (preferred in writing and formal communication) and Spoken Tamil (confined to speech and informal media). …
External link:
http://arxiv.org/abs/2311.07804
Mechanistic interpretability seeks to understand the neural mechanisms that enable specific behaviors in Large Language Models (LLMs) by leveraging causality-based methods. While these approaches have identified neural circuits that copy spans of text …
External link:
http://arxiv.org/abs/2308.14179
We introduce Jambu, a cognate database of South Asian languages which unifies dozens of previous sources in a structured and accessible format. The database includes 287k lemmata from 602 lects, grouped together in 23k sets of cognates. We outline the …
External link:
http://arxiv.org/abs/2306.02514
CGELBank is a treebank and associated tools based on a syntactic formalism for English derived from the Cambridge Grammar of the English Language. This document lays out the particularities of the CGELBank annotation scheme.
External link:
http://arxiv.org/abs/2305.17347
Localizing behaviors of neural networks to a subset of the network's components or a subset of interactions between components is a natural first step towards analyzing network mechanisms and possible failure modes. Existing work is often qualitative …
External link:
http://arxiv.org/abs/2304.05969