Showing 1 - 10 of 1,946 results for search: '"Kramar, P."'
Cells use signalling pathways as windows into the environment to gather information, transduce it into their interior, and use it to drive behaviours. MAPK (ERK) is a highly conserved signalling pathway in eukaryotes, directing multiple fundamental c
External link:
http://arxiv.org/abs/2410.22571
Published in:
PRX Life (2024) 2(3), 033005
Network-forming organisms, like fungi and slime molds, dynamically reorganize their networks during foraging. The resulting re-routing of resource flows within the organism's network can significantly impact local ecosystems. In current analysis limi
External link:
http://arxiv.org/abs/2408.17134
Authors:
Lieberum, Tom, Rajamanoharan, Senthooran, Conmy, Arthur, Smith, Lewis, Sonnerat, Nicolas, Varma, Vikrant, Kramár, János, Dragan, Anca, Shah, Rohin, Nanda, Neel
Sparse autoencoders (SAEs) are an unsupervised method for learning a sparse decomposition of a neural network's latent representations into seemingly interpretable features. Despite recent excitement about their potential, research applications outsi
External link:
http://arxiv.org/abs/2408.05147
Authors:
Rajamanoharan, Senthooran, Lieberum, Tom, Sonnerat, Nicolas, Conmy, Arthur, Varma, Vikrant, Kramár, János, Nanda, Neel
Sparse autoencoders (SAEs) are a promising unsupervised approach for identifying causally relevant and interpretable linear features in a language model's (LM) activations. To be useful for downstream tasks, SAEs need to decompose LM activations fait
External link:
http://arxiv.org/abs/2407.14435
Authors:
Kenton, Zachary, Siegel, Noah Y., Kramár, János, Brown-Cohen, Jonah, Albanie, Samuel, Bulian, Jannis, Agarwal, Rishabh, Lindner, David, Tang, Yunhao, Goodman, Noah D., Shah, Rohin
Scalable oversight protocols aim to enable humans to accurately supervise superhuman AI. In this paper we study debate, where two AIs compete to convince a judge; consultancy, where a single AI tries to convince a judge that asks questions; and comp
External link:
http://arxiv.org/abs/2407.04622
Authors:
Kramar, David, Krejcirik, David
We consider Dirac operators on the half-line, subject to generalised infinite-mass boundary conditions. We derive sufficient conditions which guarantee the stability of the spectrum against possibly non-self-adjoint potential perturbations and study
External link:
http://arxiv.org/abs/2405.10009
Authors:
Rajamanoharan, Senthooran, Conmy, Arthur, Smith, Lewis, Lieberum, Tom, Varma, Vikrant, Kramár, János, Shah, Rohin, Nanda, Neel
Recent work has found that sparse autoencoders (SAEs) are an effective technique for unsupervised discovery of interpretable features in language models' (LMs) activations, by finding sparse, linear reconstructions of LM activations. We introduce the
External link:
http://arxiv.org/abs/2404.16014
Activation Patching is a method of directly computing causal attributions of behavior to model components. However, applying it exhaustively requires a sweep with cost scaling linearly in the number of model components, which can be prohibitively exp
External link:
http://arxiv.org/abs/2403.00745
Published in:
Паёми Сино, Vol 26, Iss 4, Pp 685-693 (2024)
Arnold-Chiari malformation (ACM) is a developmental anomaly of the brain characterized by the descent of the cerebellar tonsils into the foramen magnum, leading to compression of the medulla oblongata and subsequent neurological symptoms. ACM can man
External link:
https://doaj.org/article/61519c2817b046f4a581873d2c33219c
One of the most surprising puzzles in neural network generalisation is grokking: a network with perfect training accuracy but poor generalisation will, upon further training, transition to perfect generalisation. We propose that grokking occurs when
External link:
http://arxiv.org/abs/2309.02390