Search results - "Chan, Stephanie C Y"

Report

Learned feature representations are biased by complexity, learning order, position, and more

Authors: Lampinen, Andrew Kyle, Chan, Stephanie C. Y., Hermann, Katherine

Representation learning, and interpreting learned representations, are key areas of focus in machine learning and neuroscience. Both fields generally use representations as a means to understand or improve a system's computations. In this work, howev

External link: http://arxiv.org/abs/2405.05847

Report

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

Authors: Singh, Aaditya K., Moskovitz, Ted, Hill, Felix, Chan, Stephanie C. Y., Saxe, Andrew M.

In-context learning is a powerful emergent ability in transformer models. Prior work in mechanistic interpretability has identified a circuit element that may be critical for in-context learning -- the induction head (IH), which performs a match-and-

External link: http://arxiv.org/abs/2404.07129

Report

Scaling Instructable Agents Across Many Simulated Worlds

Authors: SIMA Team, Raad, Maria Abi, Ahuja, Arun, Barros, Catarina, Besse, Frederic, Bolt, Andrew, Bolton, Adrian, Brownfield, Bethanie, Buttimore, Gavin, Cant, Max, Chakera, Sarah, Chan, Stephanie C. Y., Clune, Jeff, Collister, Adrian, Copeman, Vikki, Cullum, Alex, Dasgupta, Ishita, de Cesare, Dario, Di Trapani, Julia, Donchev, Yani, Dunleavy, Emma, Engelcke, Martin, Faulkner, Ryan, Garcia, Frankie, Gbadamosi, Charles, Gong, Zhitao, Gonzales, Lucy, Gupta, Kshitij, Gregor, Karol, Hallingstad, Arne Olav, Harley, Tim, Haves, Sam, Hill, Felix, Hirst, Ed, Hudson, Drew A., Hudson, Jony, Hughes-Fitt, Steph, Rezende, Danilo J., Jasarevic, Mimi, Kampis, Laura, Ke, Rosemary, Keck, Thomas, Kim, Junkyung, Knagg, Oscar, Kopparapu, Kavya, Lawton, Rory, Lampinen, Andrew, Legg, Shane, Lerchner, Alexander, Limont, Marjorie, Liu, Yulan, Loks-Thompson, Maria, Marino, Joseph, Cussons, Kathryn Martin, Matthey, Loic, Mcloughlin, Siobhan, Mendolicchio, Piermaria, Merzic, Hamza, Mitenkova, Anna, Moufarek, Alexandre, Oliveira, Valeria, Oliveira, Yanko, Openshaw, Hannah, Pan, Renke, Pappu, Aneesh, Platonov, Alex, Purkiss, Ollie, Reichert, David, Reid, John, Richemond, Pierre Harvey, Roberts, Tyson, Ruscoe, Giles, Elias, Jaume Sanchez, Sandars, Tasha, Sawyer, Daniel P., Scholtes, Tim, Simmons, Guy, Slater, Daniel, Soyer, Hubert, Strathmann, Heiko, Stys, Peter, Tam, Allison C., Teplyashin, Denis, Terzi, Tayfun, Vercelli, Davide, Vujatovic, Bojan, Wainwright, Marcus, Wang, Jane X., Wang, Zhengdong, Wierstra, Daan, Williams, Duncan, Wong, Nathaniel, York, Sarah, Young, Nick

Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order t

External link: http://arxiv.org/abs/2404.10179

Report

The Transient Nature of Emergent In-Context Learning in Transformers

Authors: Singh, Aaditya K., Chan, Stephanie C. Y., Moskovitz, Ted, Grant, Erin, Saxe, Andrew M., Hill, Felix

Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL) despite not being explicitly trained for it. Prior work has provided a deeper understanding of how ICL emerges in transformers, e.g. through the lens of mecha

External link: http://arxiv.org/abs/2311.08360

Report

Passive learning of active causal strategies in agents and language models

Authors: Lampinen, Andrew Kyle, Chan, Stephanie C Y, Dasgupta, Ishita, Nam, Andrew J, Wang, Jane X

What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However,

External link: http://arxiv.org/abs/2305.16183

Report

Machine Psychology

Authors: Hagendorff, Thilo, Dasgupta, Ishita, Binz, Marcel, Chan, Stephanie C. Y., Lampinen, Andrew, Wang, Jane X., Akata, Zeynep, Schulz, Eric

Large language models (LLMs) show increasingly advanced emergent capabilities and are being incorporated across various societal domains. Understanding their behavior and reasoning abilities therefore holds significant importance. We argue that a fru

External link: http://arxiv.org/abs/2303.13988

Report

Transformers generalize differently from information stored in context vs in weights

Authors: Chan, Stephanie C. Y., Dasgupta, Ishita, Kim, Junkyung, Kumaran, Dharshan, Lampinen, Andrew K., Hill, Felix

Transformer models can use two fundamentally different kinds of information: information stored in weights during training, and information provided "in-context" at inference time. In this work, we show that transformers exhibit different inductive

External link: http://arxiv.org/abs/2210.05675

Report

Language models show human-like content effects on reasoning tasks

Authors: Dasgupta, Ishita, Lampinen, Andrew K., Chan, Stephanie C. Y., Sheahan, Hannah R., Creswell, Antonia, Kumaran, Dharshan, McClelland, James L., Hill, Felix

Reasoning is a key ability for an intelligent system. Large language models (LMs) achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect. For example, human re

External link: http://arxiv.org/abs/2207.07051

Report

Data Distributional Properties Drive Emergent In-Context Learning in Transformers

Authors: Chan, Stephanie C. Y., Santoro, Adam, Lampinen, Andrew K., Wang, Jane X., Singh, Aaditya, Richemond, Pierre H., McClelland, Jay, Hill, Felix

Large transformer-based models are able to perform in-context few-shot learning, without being explicitly trained for it. This observation raises the question: what aspects of the training regime lead to this emergent behavior? Here, we show that thi

External link: http://arxiv.org/abs/2205.05055

Report

Semantic Exploration from Language Abstractions and Pretrained Representations

Authors: Tam, Allison C., Rabinowitz, Neil C., Lampinen, Andrew K., Roy, Nicholas A., Chan, Stephanie C. Y., Strouse, DJ, Wang, Jane X., Banino, Andrea, Hill, Felix

Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty

External link: http://arxiv.org/abs/2204.05080
