Showing 1 - 10 of 8,916 for search: '"Wilson Andrew"'
Author:
Lotfi, Sanae, Kuang, Yilun, Amos, Brandon, Goldblum, Micah, Finzi, Marc, Wilson, Andrew Gordon
Large language models (LLMs) with billions of parameters excel at predicting the next token in a sequence. Recent work computes non-vacuous compression-based generalization bounds for LLMs, but these bounds are vacuous for large models at the billion-parameter scale… (an illustrative bound is sketched below).
External link:
http://arxiv.org/abs/2407.18158
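For context, the flavor of a "compression-based generalization bound" can be made concrete. Below is a minimal Occam-style bound for a loss bounded in [0, 1], assuming only that the model's description compresses to a known number of bits; the function name and exact constants are illustrative, not the paper's bound.

import math

def compression_bound(empirical_loss, compressed_bits, n, delta=0.05):
    # With probability >= 1 - delta, population loss <= empirical loss
    # plus a complexity term that grows with the compressed model size
    # (in bits) and shrinks with the number of samples n.
    complexity = compressed_bits * math.log(2) + math.log(1.0 / delta)
    return empirical_loss + math.sqrt(complexity / (2 * n))

# A bound is "non-vacuous" only if it lands below the trivial value 1.0;
# at billions of parameters, compressed_bits alone can push it past that.
print(compression_bound(empirical_loss=0.3, compressed_bits=1e9, n=1e10))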
Author:
Shwartz-Ziv, Ravid, Goldblum, Micah, Bansal, Arpit, Bruss, C. Bayan, LeCun, Yann, Wilson, Andrew Gordon
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessible… (a quick empirical probe is sketched below).
External link:
http://arxiv.org/abs/2406.11463
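The claim is easy to probe empirically. A hypothetical experiment (not the paper's protocol): give a small network purely random labels, so interpolation requires memorization, and compare the number of samples it actually fits against its parameter count.

import torch
import torch.nn as nn

torch.manual_seed(0)
n, d = 2000, 20
X = torch.randn(n, d)
y = torch.randint(0, 2, (n,)).float()        # random labels: memorization only

model = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))
n_params = sum(p.numel() for p in model.parameters())   # 1409 here
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(3000):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(model(X).squeeze(1), y)
    loss.backward()
    opt.step()

acc = ((model(X).squeeze(1) > 0) == y.bool()).float().mean()
print(f"{n_params} params vs {n} samples: train accuracy {acc:.3f}")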
To make accurate predictions, understand mechanisms, and design interventions in systems of many variables, we wish to learn causal graphs from large scale data. Unfortunately, the space of all possible causal graphs is enormous, so scalably and accurately… (see the counting sketch below).
External link:
http://arxiv.org/abs/2406.09177
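"Enormous" is literal here: the number of labeled DAGs grows super-exponentially in the number of variables. Robinson's recurrence (OEIS A003024) makes the point in a few lines:

from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    # Robinson's recurrence for the number of DAGs on n labeled nodes.
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in (3, 5, 10):
    print(n, num_dags(n))
# n = 10 already gives ~4.2e18 candidate graphs, so exhaustive search
# is hopeless and scalable structure learning is the whole game.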
Author:
Kapoor, Sanyam, Gruver, Nate, Roberts, Manley, Collins, Katherine, Pal, Arka, Bhatt, Umang, Weller, Adrian, Dooley, Samuel, Goldblum, Micah, Wilson, Andrew Gordon
When using large language models (LLMs) in high-stakes applications, we need to know when we can trust their predictions. Some works argue that prompting high-performance LLMs is sufficient to produce calibrated uncertainties, while others introduce… (a minimal calibration metric is sketched below).
External link:
http://arxiv.org/abs/2406.08391
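"Calibrated" has a precise meaning: among predictions offered with confidence p, roughly a fraction p should be correct. A standard way to quantify the gap is expected calibration error (ECE); the sketch below assumes per-answer confidences and correctness flags have already been extracted from the LLM.

import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # Bin predictions by confidence, then average the per-bin
    # |accuracy - mean confidence| gap, weighted by bin size.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# Uniformly overconfident model: says 0.99, is right 60% of the time.
print(expected_calibration_error([0.99] * 10, [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]))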
Author:
Qiu, Shikai, Han, Boran, Maddix, Danielle C., Zhang, Shuai, Wang, Yuyang, Wilson, Andrew Gordon
How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information… (one standard alternative is sketched below).
External link:
http://arxiv.org/abs/2406.07337
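One widely used alternative to initializing from pre-trained weights is knowledge distillation, where the small model is trained to match the large model's output distribution. The sketch below shows the classic temperature-scaled KL objective; it is a stand-in for "transferring knowledge", not necessarily the method this paper proposes.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between temperature-softened teacher and student
    # distributions; the t**2 factor keeps gradient magnitudes
    # comparable across temperatures.
    t = temperature
    log_student = F.log_softmax(student_logits / t, dim=-1)
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * t ** 2

student = torch.randn(8, 100, requires_grad=True)   # small downstream model
teacher = torch.randn(8, 100)                       # frozen foundation model
print(distillation_loss(student, teacher))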
Dense linear layers are the dominant computational bottleneck in foundation models. Identifying more efficient alternatives to dense matrices has enormous potential for building more compute-efficient models, as exemplified by the success of convolutional… (a cost comparison is sketched below).
External link:
http://arxiv.org/abs/2406.06248
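The arithmetic behind the bottleneck is simple: a dense d-by-d layer costs d**2 multiply-accumulates per input, while structured families cost far less. A rank-r factorization is one illustrative structure, not necessarily the one the paper advocates; assuming square layers:

import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    # y = (x @ U) @ V costs 2*d*r multiply-accumulates versus d*d dense.
    def __init__(self, d, r):
        super().__init__()
        self.U = nn.Parameter(torch.randn(d, r) / d ** 0.5)
        self.V = nn.Parameter(torch.randn(r, d) / r ** 0.5)

    def forward(self, x):
        return (x @ self.U) @ self.V

d, r = 4096, 128
print(f"dense: {d * d:,} MACs, rank-{r}: {2 * d * r:,} MACs "
      f"({d * d / (2 * d * r):.0f}x cheaper)")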
Author:
Lavoie, Samuel, Kirichenko, Polina, Ibrahim, Mark, Assran, Mahmoud, Wilson, Andrew Gordon, Courville, Aaron, Ballas, Nicolas
There are a thousand ways to caption an image. Contrastive Language-Image Pretraining (CLIP), on the other hand, works by mapping an image and its caption to a single vector -- limiting how well CLIP-like models can represent the diverse ways to describe an image… (the contrastive objective is sketched below).
External link:
http://arxiv.org/abs/2405.00740
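The "single vector" bottleneck comes from CLIP's training objective: every image and every caption is embedded once, and a symmetric contrastive loss pulls matching pairs together. A schematic of that loss, assuming encoders that return L2-normalized embeddings:

import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE: row i of each matrix is a matching pair,
    # every other row in the batch serves as a negative.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(image_emb.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

img = F.normalize(torch.randn(16, 512), dim=-1)   # image encoder outputs
txt = F.normalize(torch.randn(16, 512), dim=-1)   # text encoder outputs
print(clip_loss(img, txt))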
Author:
Souri, Hossein, Bansal, Arpit, Kazemi, Hamid, Fowl, Liam, Saha, Aniruddha, Geiping, Jonas, Wilson, Andrew Gordon, Chellappa, Rama, Goldstein, Tom, Goldblum, Micah
Modern neural networks are often trained on massive datasets that are web scraped with minimal human inspection. As a result of this insecure curation pipeline, an adversary can poison or backdoor the resulting model by uploading malicious data to the internet… (the basic threat model is sketched below).
External link:
http://arxiv.org/abs/2403.16365
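To illustrate the threat model (this is a textbook BadNets-style trigger, not the attack this paper develops): an attacker stamps a small patch on a handful of training images and relabels them, and a model trained on the mix learns to obey the patch.

import numpy as np

def poison(images, labels, target_class, rate=0.01, seed=0):
    # Stamp a 3x3 white patch in one corner of a random subset of
    # images and relabel them; returns poisoned copies.
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images[idx, -3:, -3:, :] = 1.0        # the trigger
    labels[idx] = target_class            # the attacker's chosen label
    return images, labels

x = np.random.rand(1000, 32, 32, 3).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
px, py = poison(x, y, target_class=0)
print(int((px != x).any(axis=(1, 2, 3)).sum()), "examples carry the trigger")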
Author:
Rajaram, Shwetha, Numan, Nels, Kumaravel, Balasaravanan Thoravi, Marquardt, Nicolai, Wilson, Andrew D.
Today's video-conferencing tools support a rich range of professional and social activities, but their generic, grid-based environments cannot be easily adapted to meet the varying needs of distributed collaborators. To enable end-user customization, …
External link:
http://arxiv.org/abs/2403.13947
Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this… (a worst-group evaluation is sketched below).
External link:
http://arxiv.org/abs/2403.09869
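Under subpopulation shift the quantity of interest is usually worst-group performance rather than average accuracy, since a model can score well on average while failing a minority group. A small evaluation helper, assuming group labels are available at test time:

import numpy as np

def worst_group_accuracy(preds, labels, groups):
    # Per-group accuracy, reported alongside its minimum: the number
    # subpopulation-shift methods try to raise.
    preds, labels, groups = map(np.asarray, (preds, labels, groups))
    accs = {int(g): float((preds[groups == g] == labels[groups == g]).mean())
            for g in np.unique(groups)}
    return min(accs.values()), accs

# 80% average accuracy, but the minority group (group 1) is at 0%.
preds  = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
labels = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
groups = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
print(worst_group_accuracy(preds, labels, groups))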