Showing 1 - 10 of 1,610 results for search: '"Mallen P"'
While the activations of neurons in deep neural networks usually do not have a simple human-understandable interpretation, sparse autoencoders (SAEs) can be used to transform these activations into a higher-dimensional latent space which may be more
External link:
http://arxiv.org/abs/2410.13928
Author:
Mallen, Alex, Belrose, Nora
Scalable oversight studies methods of training and evaluating AI systems in domains where human judgment is unreliable or expensive, such as scientific research and software engineering in complex codebases. Most work in this area has focused on meth
External link:
http://arxiv.org/abs/2410.13215
The distributional simplicity bias (DSB) posits that neural networks learn low-order moments of the data distribution first, before moving on to higher-order correlations. In this work, we present compelling new evidence for the DSB by showing that n
External link:
http://arxiv.org/abs/2402.04362
Author:
Aceto, Luca, Fábregas, Ignacio, García-Pérez, Álvaro, Ingólfsdóttir, Anna, Ortega-Mallén, Yolanda
The nominal transition systems (NTSs) of Parrow et al. describe the operational semantics of nominal process calculi. We study NTSs in terms of the nominal residual transition systems (NRTSs) that we introduce. We provide rule formats for the specifi
External link:
http://arxiv.org/abs/2402.00982
Eliciting Latent Knowledge (ELK) aims to find patterns in a capable neural network's activations that robustly track the true state of the world, especially in hard-to-verify cases where the model's output is untrusted. To further ELK research, we in
External link:
http://arxiv.org/abs/2312.01037
Author:
Zou, Andy, Phan, Long, Chen, Sarah, Campbell, James, Guo, Phillip, Ren, Richard, Pan, Alexander, Yin, Xuwang, Mazeika, Mantas, Dombrowski, Ann-Kathrin, Goel, Shashwat, Li, Nathaniel, Byun, Michael J., Wang, Zifan, Mallen, Alex, Basart, Steven, Koyejo, Sanmi, Song, Dawn, Fredrikson, Matt, Kolter, J. Zico, Hendrycks, Dan
In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience. RepE places population-level representatio
External link:
http://arxiv.org/abs/2310.01405
Author:
Mallen, Alex, Asai, Akari, Zhong, Victor, Das, Rajarshi, Khashabi, Daniel, Hajishirzi, Hannaneh
Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge. This pa
External link:
http://arxiv.org/abs/2212.10511
Author:
Terence W O'Neill, Robert Horne, Clare Jinks, Christian D Mallen, Zoe Paskins, Elaine Nicholls, Laurna Bullock, Stephanie Butler-Walley, Andrea Cherrington, Jane Fleming, Emma M Clark, Ida Bentley, Sarah Leyland, Cynthia P Iglesias-Urrutia, Simon Thomas, Jo Smith, David Webb, Sarah Lewis, Sarah Bathers, Michele Siciliano, Angela Clifford, Sarah Ryan, Joanne Protheroe, Nicky Dale, Janet Lefroy, Sarah Connacher, Ashley Hawarden
Published in:
NIHR Open Research, Vol 4 (2024)
Background: Good-quality shared decision-making (SDM) conversations involve people with, or at risk of, osteoporosis and clinicians collaborating to decide, where appropriate, which evidence-based medicines best fit the person’s life, beliefs, and va
External link:
https://doaj.org/article/4a8d978b425241f8ad42b60aaffc7af5
In many scenarios, it is necessary to monitor a complex system via a time-series of observations and determine when anomalous exogenous events have occurred so that relevant actions can be taken. Determining whether current observations are abnormal
External link:
http://arxiv.org/abs/2209.08618
Author:
Delphine Kirkove, Sara Willems, Esther Van Poel, Nadia Dardenne, Anne-Françoise Donneau, Elodie Perrin, Cécile Ponsar, Christian Mallen, Neophytos Stylianou, Claire Collins, Rémi Gagnayre, Benoit Pétré
Published in:
BMC Primary Care, Vol 24, Iss S1, Pp 1-12 (2024)
Background: In response to the COVID-19 pandemic, the World Health Organization established a number of key recommendations such as educational activities especially within primary care practices (PCPs) which are a key component of this strat
External link:
https://doaj.org/article/890c4f32029d48b792132404c973a3ae