Search results

Report

Predicting Emergent Capabilities by Finetuning

Authors: Snell, Charlie; Wallace, Eric; Klein, Dan; Levine, Sergey

A fundamental open challenge in modern LLM scaling is the lack of understanding around emergent capabilities. In particular, language model pretraining loss is known to be highly predictable as a function of compute. However, downstream capabilities

External link: http://arxiv.org/abs/2411.16035

Report

Improving Predictor Reliability with Selective Recalibration

Authors: Zollo, Thomas P.; Deng, Zhun; Snell, Jake C.; Pitassi, Toniann; Zemel, Richard

A reliable deep learning system should be able to accurately express its confidence with respect to its predictions, a quality known as calibration. One of the most effective ways to produce reliable confidence estimates with a pre-trained model is b

External link: http://arxiv.org/abs/2410.05407

Report

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Authors: Snell, Charlie; Lee, Jaehoon; Xu, Kelvin; Kumar, Aviral

Enabling LLMs to improve their outputs by using more test-time computation is a critical step towards building generally self-improving agents that can operate on open-ended natural language. In this paper, we study the scaling of inference-time comp

External link: http://arxiv.org/abs/2408.03314

Report

Sample size for developing a prediction model with a binary outcome: targeting precise individual risk estimates to improve clinical decisions and fairness

Authors: Riley, Richard D; Collins, Gary S; Whittle, Rebecca; Archer, Lucinda; Snell, Kym IE; Dhiman, Paula; Kirton, Laura; Legha, Amardeep; Liu, Xiaoxuan; Denniston, Alastair; Harrell Jr, Frank E; Wynants, Laure; Martin, Glen P; Ensor, Joie

When developing a clinical prediction model, the sample size of the development dataset is a key consideration. Small sample sizes lead to greater concerns of overfitting, instability, poor performance and lack of fairness. Previous research has outl

External link: http://arxiv.org/abs/2407.09293

Report

Targeting low micro-roughness for 3D printed aluminium mirrors using a hot isostatic press

Authors: Atkins, Carolyn; Chahid, Younes; Lister, Gregory; Tuck, Rhys; Kotlewski, Richard; Snell, Robert M.; Livera, Elaine R.; Faour, Mariam; Todd, Iain; Deffley, Robert; Shipley, James; Walsh, Tom; Gardstam, Johannes; Bourgenot, Cyril; White, Paul; Davies, Spencer; Tammas-Williams, Samuel

Additive manufacturing (AM; 3D printing) in aluminium using laser powder bed fusion provides a new design space for lightweight mirror production. Printing layer-by-layer enables the use of intricate lattices for mass reduction, as well as organic sh

External link: http://arxiv.org/abs/2407.07405

Report

Extended sample size calculations for evaluation of prediction models using a threshold for classification

Authors: Whittle, Rebecca; Ensor, Joie; Archer, Lucinda; Collins, Gary S.; Dhiman, Paula; Denniston, Alastair; Alderman, Joseph; Legha, Amardeep; van Smeden, Maarten; Moons, Karel G.; Cazier, Jean-Baptiste; Riley, Richard D.; Snell, Kym I. E.

When evaluating the performance of a model for individualised risk prediction, the sample size needs to be large enough to precisely estimate the performance measures of interest. Current sample size guidance is based on precisely estimating calibrat

External link: http://arxiv.org/abs/2406.19673

Report

Using Contrastive Learning with Generative Similarity to Learn Spaces that Capture Human Inductive Biases

Authors: Marjieh, Raja; Kumar, Sreejan; Campbell, Declan; Zhang, Liyi; Bencomo, Gianluca; Snell, Jake; Griffiths, Thomas L.

Humans rely on strong inductive biases to learn from few examples and abstract useful information from sensory data. Instilling such biases in machine learning models has been shown to improve their performance on various benchmarks including few-sho

External link: http://arxiv.org/abs/2405.19420

Report

Enhancing Space Situational Awareness to Mitigate Risk: A Single-Case Study in the Misidentification of a Recently-Launched Starlink Satellite Train as a UAP in Commercial Aviation

Authors: Buettner, Douglas J.; Griffiths, Richard E.; Snell, Nick; Stilley, John

Over the past several years, the misidentification of SpaceX Starlink satellites as Unidentified Aerial Phenomena (UAP) by pilots and laypersons has generated unnecessary aviation risk and confusion. The many deployment and orbital evolution strategi

External link: http://arxiv.org/abs/2403.08155

Report

LLMAuditor: A Framework for Auditing Large Language Models Using Human-in-the-Loop

Authors: Amirizaniani, Maryam; Yao, Jihan; Lavergne, Adrian; Okada, Elizabeth Snell; Chadha, Aman; Roosta, Tanya; Shah, Chirag

As Large Language Models (LLMs) become more pervasive across various users and scenarios, identifying potential issues when using these models becomes essential. Examples of such issues include: bias, inconsistencies, and hallucination. Although audi

External link: http://arxiv.org/abs/2402.09346

Report

A Metalearned Neural Circuit for Nonparametric Bayesian Inference

Authors: Snell, Jake C.; Bencomo, Gianluca; Griffiths, Thomas L.

Most applications of machine learning to classification assume a closed set of balanced classes. This is at odds with the real world, where class occurrence statistics often follow a long-tailed power-law distribution and it is unlikely that all clas

External link: http://arxiv.org/abs/2311.14601
