Showing 1 - 10 of 53 for search: '"Chan, Stephanie C Y"'
Representation learning, and interpreting learned representations, are key areas of focus in machine learning and neuroscience. Both fields generally use representations as a means to understand or improve a system's computations. In this work, however, …
External link:
http://arxiv.org/abs/2405.05847
In-context learning is a powerful emergent ability in transformer models. Prior work in mechanistic interpretability has identified a circuit element that may be critical for in-context learning -- the induction head (IH), which performs a match-and-copy …
External link:
http://arxiv.org/abs/2404.07129
Author:
SIMA Team, Raad, Maria Abi, Ahuja, Arun, Barros, Catarina, Besse, Frederic, Bolt, Andrew, Bolton, Adrian, Brownfield, Bethanie, Buttimore, Gavin, Cant, Max, Chakera, Sarah, Chan, Stephanie C. Y., Clune, Jeff, Collister, Adrian, Copeman, Vikki, Cullum, Alex, Dasgupta, Ishita, de Cesare, Dario, Di Trapani, Julia, Donchev, Yani, Dunleavy, Emma, Engelcke, Martin, Faulkner, Ryan, Garcia, Frankie, Gbadamosi, Charles, Gong, Zhitao, Gonzales, Lucy, Gupta, Kshitij, Gregor, Karol, Hallingstad, Arne Olav, Harley, Tim, Haves, Sam, Hill, Felix, Hirst, Ed, Hudson, Drew A., Hudson, Jony, Hughes-Fitt, Steph, Rezende, Danilo J., Jasarevic, Mimi, Kampis, Laura, Ke, Rosemary, Keck, Thomas, Kim, Junkyung, Knagg, Oscar, Kopparapu, Kavya, Lawton, Rory, Lampinen, Andrew, Legg, Shane, Lerchner, Alexander, Limont, Marjorie, Liu, Yulan, Loks-Thompson, Maria, Marino, Joseph, Cussons, Kathryn Martin, Matthey, Loic, Mcloughlin, Siobhan, Mendolicchio, Piermaria, Merzic, Hamza, Mitenkova, Anna, Moufarek, Alexandre, Oliveira, Valeria, Oliveira, Yanko, Openshaw, Hannah, Pan, Renke, Pappu, Aneesh, Platonov, Alex, Purkiss, Ollie, Reichert, David, Reid, John, Richemond, Pierre Harvey, Roberts, Tyson, Ruscoe, Giles, Elias, Jaume Sanchez, Sandars, Tasha, Sawyer, Daniel P., Scholtes, Tim, Simmons, Guy, Slater, Daniel, Soyer, Hubert, Strathmann, Heiko, Stys, Peter, Tam, Allison C., Teplyashin, Denis, Terzi, Tayfun, Vercelli, Davide, Vujatovic, Bojan, Wainwright, Marcus, Wang, Jane X., Wang, Zhengdong, Wierstra, Daan, Williams, Duncan, Wong, Nathaniel, York, Sarah, Young, Nick
Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to …
External link:
http://arxiv.org/abs/2404.10179
Author:
Singh, Aaditya K., Chan, Stephanie C. Y., Moskovitz, Ted, Grant, Erin, Saxe, Andrew M., Hill, Felix
Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL) despite not being explicitly trained for it. Prior work has provided a deeper understanding of how ICL emerges in transformers, e.g. through the lens of mechanistic …
External link:
http://arxiv.org/abs/2311.08360
What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, …
External link:
http://arxiv.org/abs/2305.16183
Author:
Hagendorff, Thilo, Dasgupta, Ishita, Binz, Marcel, Chan, Stephanie C. Y., Lampinen, Andrew, Wang, Jane X., Akata, Zeynep, Schulz, Eric
Large language models (LLMs) show increasingly advanced emergent capabilities and are being incorporated across various societal domains. Understanding their behavior and reasoning abilities therefore holds significant importance. We argue that a fruitful …
External link:
http://arxiv.org/abs/2303.13988
Author:
Chan, Stephanie C. Y., Dasgupta, Ishita, Kim, Junkyung, Kumaran, Dharshan, Lampinen, Andrew K., Hill, Felix
Transformer models can use two fundamentally different kinds of information: information stored in weights during training, and information provided "in-context" at inference time. In this work, we show that transformers exhibit different inductive …
External link:
http://arxiv.org/abs/2210.05675
Author:
Dasgupta, Ishita, Lampinen, Andrew K., Chan, Stephanie C. Y., Sheahan, Hannah R., Creswell, Antonia, Kumaran, Dharshan, McClelland, James L., Hill, Felix
Reasoning is a key ability for an intelligent system. Large language models (LMs) achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect. For example, human reasoning …
External link:
http://arxiv.org/abs/2207.07051
Author:
Chan, Stephanie C. Y., Santoro, Adam, Lampinen, Andrew K., Wang, Jane X., Singh, Aaditya, Richemond, Pierre H., McClelland, Jay, Hill, Felix
Large transformer-based models are able to perform in-context few-shot learning, without being explicitly trained for it. This observation raises the question: what aspects of the training regime lead to this emergent behavior? Here, we show that this …
External link:
http://arxiv.org/abs/2205.05055
Author:
Tam, Allison C., Rabinowitz, Neil C., Lampinen, Andrew K., Roy, Nicholas A., Chan, Stephanie C. Y., Strouse, DJ, Wang, Jane X., Banino, Andrea, Hill, Felix
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty …
External link:
http://arxiv.org/abs/2204.05080